Identifying a Tone (Sine Wave) in iOS with the Accelerate Framework

Introduction

If you just want the code:

ToneReceiver.m – The finished code for this post

Handshake – The complete project

I found a lot of people had the same question that I did: "How do you identify a frequency in <insert language here>?" The answer was usually the same: use a fast Fourier transform. It's even built into iOS.

Fast Fourier Transforms

While I've learned more physics programming for my phone than I ever used in college, I did have physics classes. Fast Fourier transforms weren't mentioned in any of my math classes; they're primarily used in electrical engineering.

The Wikipedia entry lost me. I figured out how to use them before I understood how they (probably) work. What I did figure out is that with the Accelerate framework I could pass vDSP_fft_zrip an array of samples and get back an array of intensities at particular frequency ranges. The maximum value in that array corresponds to the strongest frequency range.

A quick note on the Accelerate framework

The Accelerate framework has a number of functions for digital signal processing. For speed, these are C functions, which must be bridged so they can be used from Objective-C. The fast Fourier transform functions are just some of what the framework provides. There are functions for taking integrals, derivatives, the Fourier transforms I'm using, and other processing that I've yet to learn.

Packing the Data

The Fourier functions all operate on complex arrays, but the microphone provides real data, so the samples have to be converted and packed first.

//Convert the microphone data to real data
float *samples = malloc(numSamples * sizeof(float));
vDSP_vflt16((short *)inSamples, 1, samples, 1, numSamples);

//Convert the real data to complex data
// 1. Populate *window with the values for a hamming window function
float *window = (float *)malloc(sizeof(float) * numSamples);
vDSP_hamm_window(window, numSamples, 0);

// 2. Window the samples
vDSP_vmul(samples, 1, window, 1, samples, 1, numSamples);
      
//3. Define complex buffer
COMPLEX_SPLIT A;
A.realp = (float *) malloc(halfSamples * sizeof(float));
A.imagp = (float *) malloc(halfSamples * sizeof(float));
      
// Pack samples:
vDSP_ctoz((COMPLEX*)samples, 2, &A, 1, numSamples/2);

vDSP_fft_zrip is an in-place function, so the number of frequency ranges is exactly the same as the number of samples fed in, and everything works best when that number is a power of two. The iPhone takes 44,100 samples per second, and 1024 is a nice power of two. So from 1024 samples, about 1/43 of a second of audio, I can identify which 43 Hz-wide bucket holds the strongest frequency. Not good enough for a tuner, but good enough to communicate. If I wanted greater resolution I could just take more samples. For this proof of concept, a 43 Hz resolution is enough.

// Setup the FFT
// 1. Setup the radix (exponent)
int fftRadix = log2(numSamples);
int halfSamples = (int)(numSamples / 2);
// 2. And setup the FFT
FFTSetup setup = vDSP_create_fftsetup(fftRadix, FFT_RADIX2);

And at the heart of the function, perform the fast Fourier transform.

// Perform a forward FFT using fftSetup and A
// Results are returned in A
vDSP_fft_zrip(setup, &A, 1, fftRadix, FFT_FORWARD);
      
// Convert COMPLEX_SPLIT A result to magnitudes
float amp[numSamples];
amp[0] = A.realp[0]/(numSamples*2);
      
// Find the max
int maxIndex = 0;
float maxMag = 0.0;
      
// We can't detect anything reliably above the Nyquist frequency,
// which is bin n/2.
for (int i = 1; i < halfSamples; i++)
{
   amp[i] = A.realp[i] * A.realp[i] + A.imagp[i] * A.imagp[i];
   if (amp[i] > maxMag)
   {
      maxMag = amp[i];
      maxIndex = i;
   }
}

Recording

Apple provides a delegate that is called when the audio buffer is full. This class just has to implement AVCaptureAudioDataOutputSampleBufferDelegate.

-(void)start
{
   AVAudioSession *session = [AVAudioSession sharedInstance];
   [session setActive:YES error:nil];
   
   self.captureSession = [[AVCaptureSession alloc] init];
   AVCaptureDevice *device = [AVCaptureDevice defaultDeviceWithMediaType:AVMediaTypeAudio];
   AVCaptureDeviceInput *input = [AVCaptureDeviceInput deviceInputWithDevice:device error:NULL];
 
   [self.captureSession addInput:input];
   
   AVCaptureAudioDataOutput *output = [[AVCaptureAudioDataOutput alloc] init];
   dispatch_queue_t queue = dispatch_queue_create("Sample callback", DISPATCH_QUEUE_SERIAL);
   [output setSampleBufferDelegate:self queue:queue];
   [self.captureSession addOutput:output];
   
   [self.captureSession startRunning];
}

And what gets called when the buffer is full:

- (void)captureOutput:(AVCaptureOutput *)captureOutput
didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer
       fromConnection:(AVCaptureConnection *)connection

And a Note on a Hack

I had a problem getting the sample buffer size to be consistent. The first time I started recording I'd get 4096 samples, and all subsequent times I'd get 1024. For consistency I implemented a bit of a hack.

- (void)totalHackToGetAroundAppleNotSettingIOBufferDuration
{
   self.captureSession = [[AVCaptureSession alloc] init];
   AVCaptureDevice *device = [AVCaptureDevice defaultDeviceWithMediaType:AVMediaTypeAudio];
   AVCaptureDeviceInput *input = [AVCaptureDeviceInput deviceInputWithDevice:device error:NULL];
   
   [self.captureSession addInput:input];
   
   AVCaptureAudioDataOutput *output = [[AVCaptureAudioDataOutput alloc] init];
   dispatch_queue_t queue = dispatch_queue_create("Sample callback", DISPATCH_QUEUE_SERIAL);
   [output setSampleBufferDelegate:self queue:queue];
   [self.captureSession addOutput:output];
   
   [self.captureSession startRunning];
   [self.captureSession stopRunning];
}

Sending the Result

It's kind of anticlimactic after the trouble of figuring everything out, but sending the result back is a perfect case for a simple delegate.

NSNumber* toSend = [[NSNumber alloc] initWithInt:maxIndex];
      
if (self.delegate)
{
   [self.delegate didReceiveTone:toSend];
}

Where I Screwed Up

I made enough mistakes writing this class to fill a blog post on its own. So it's going to get one.

Future Work

This code identifies the maximum frequency bucket across the entire range that the iPhone can receive. For a proof of concept that's fine, but in production I'd likely look for local maxima across the frequency range I was interested in and compare those to the maxima over the rest of the data.

I’d also like to remove that hack, but again this is a proof of concept.

Playing a Pure Tone (Sine Wave) in iOS

Introduction

If you just want the code:

ToneGenerator.m – The finished class for this post

HandShake – The complete project

Most programming languages have some variant of: Beep(frequency, duration);

Objective-C does not. Part of this is the nature of iOS devices: an interruption can happen at any time, meaning there's no way to guarantee that the call will have enough time to complete. To produce a tone on demand, the programmer must fill the audio buffer with the tone data, and the device will play the data when it can.

Audio Player Setup

I got most of the setup code from http://christianfloisand.wordpress.com/2013/07/30/building-a-tone-generator-for-ios-using-audio-units/. An explanation is also provided there so there’s no need to restate it here.

Rendering the Audio Data

Forgetting my high school physics completely, I expected a constant frequency to need constant audio data. I expected a tone of 440 Hz to look something like "[440, 440, 440, 440, 440]".

A pure tone is a pure wave. Generating a wave is easy using the sine function. I started with code from http://www.cocoawithlove.com/2010/10/ios-tone-generator-introduction-to.html, but had to update it for portability (see "Where I Screwed Up #2").

OSStatus RenderTone(
                    void *inRefCon,
                    AudioUnitRenderActionFlags 	*ioActionFlags,
                    const AudioTimeStamp *inTimeStamp,
                    UInt32 inBusNumber,
                    UInt32 inNumberFrames,
                    AudioBufferList *ioData)

{
	// Fixed amplitude is good enough for our purposes
	const double amplitude = 1;
   
	// Get the tone parameters out of the class
	ToneGenerator *toneGenerator = (__bridge ToneGenerator*)inRefCon;
	double theta = toneGenerator->_theta;
	double frequency = toneGenerator->_frequency;

	double theta_increment = 2.0 * M_PI * frequency / SAMPLE_RATE;
   
	// This is a mono tone generator so we only need the first buffer
	const int channel = 0;
	Float32 *buffer = (Float32 *)ioData->mBuffers[channel].mData;
	
	// Generate the samples
	for (UInt32 frame = 0; frame < inNumberFrames; frame++)
	{
		buffer[frame] = sin(theta) * amplitude;
		
		theta += theta_increment;
		if (theta > 2.0 * M_PI)
		{
			theta -= 2.0 * M_PI;
		}
	}
	
	// Store the theta back in the object
	toneGenerator->_theta = theta;
   
	return noErr;
}

Originally I didn’t save theta as a class variable. The result was that the wave never completed a cycle and instead of a beep it sounded like a click.

Playing the Tone

The class now has enough in it to start playing a tone.

   [self.audioSession setActive:true error:nil];
   self.frequency = frequency;
   // Create the audio unit as shown above
   [self createToneUnit];
   
   // Start playback
   AudioOutputUnitStart(_toneUnit);

To keep track of time, I cheated and just slept the thread using [NSThread sleepUntilDate:date].

To stop playing, all that's necessary is tearing down the tone unit.

   self.frequency = 0;
   self.theta = 0;
   AudioOutputUnitStop(self.toneUnit);
   AudioUnitUninitialize(self.toneUnit);
   AudioComponentInstanceDispose(self.toneUnit);
   self.toneUnit = nil;

For completeness, the stop handler and interruption handlers should do the same thing.

Where I Screwed Up #1

When I settled on my naive encoding, I figured I’d just assign a tone per bucket based on the ASCII value of the character to send. Since I wanted them all to be inaudible I’d make sure all sent tones were above 19 kHz.
frequency = 19000 + (43 * (int)charToSend);

I didn't realize until later that the lowest ASCII value I was sending was the '.' character (ASCII 46), with a corresponding frequency of 20,978 Hz. As far as I can tell, that's 978 Hz above the rated max of my iPhone speakers. It actually still works for values under '5' (ASCII 53), but I shouldn't expect that.

It works for the proof of concept though. Production uses would require a better encoding scheme, for this and other reasons.

Where I Screwed Up #2

A lot of the code samples I used predate automatic reference counting. In the beginning the code crashed after 1/43 of a second (one 1024-sample buffer at 44,100 Hz).

Through some trial and error I managed to get through one entire tone, but all subsequent tones wouldn't play correctly. My cats really hated these problems.

Automatic reference counting turned out to be the culprit. I wasn’t correctly casting the ToneGenerator in the RenderTone function so theta wasn’t properly saved and new tone units would not use fresh variables. The solution turned out to be the __bridge cast in the RenderTone function.

Resources

I found a number of helpful blog posts:

Building a Tone Generator for iOS Using Audio Units – http://christianfloisand.wordpress.com/2013/07/30/building-a-tone-generator-for-ios-using-audio-units/

An iOS Tone Generator (an Introduction to AudioUnits) – http://www.cocoawithlove.com/2010/10/ios-tone-generator-introduction-to.html

Where’d Matt Go?

When I started this blog, I intended to alternate between process reflections and technical posts. That fell apart.

The fall was precipitated by a team assignment change. I didn't change companies (I didn't have to), but as often happens in a software company I changed technology stacks. I went from a web team on the Microsoft stack to an iOS team writing in Objective-C.

Very few fields exemplify "learn or die" as much as a tech company. I'm always happy to learn a new skill, but the time had to come from somewhere. Instead of reflecting I binge-watched Stanford's iOS lecture series. Instead of pushing the limits of CSS I spent my time writing practice iPhone apps. When I just about had a handle on Objective-C, Apple made an unexpected announcement.

It wasn’t until I watched Google I/O that I realized I learned enough to write a demo app. Google announced an innovation that seemed worth porting to iOS: pairing devices using nearly ultrasonic tones.

To prevent what happened last time, I’m calling my shot:

  1. Part 1: Playing a pure tone (sine wave) in iOS
  2. Part 2: Identifying a frequency in iOS using the Accelerate Framework
  3. Part 3: Troubleshooting EXC_BAD_ACCESS and memory leaks in Xcode
  4. Part 4: Putting it all together

The writing for computers is done. The writing for people is soon to come.

Are Deadlines Agile?

I had two weeks. I hadn’t fallen behind; I didn’t procrastinate. That was all the time I was given to work on the final assignment— a rather interesting project at that— for my data structures class.

There were classic answers to the problems given in the assignment: this should be a class, that should be a list of that class, and so on. It wasn't a poorly designed project. There was a right way to do it, but my grade didn't depend on doing what was right.

I got an A by breaking the rules.

Finishing that assignment started a habit I spent a long time breaking. Some of the worst code I've written and the decisions I regret the most came from the pressure to meet a deadline. Though I've gotten better, taking the fast route is still a temptation. I'm not alone there, either: when deadlines approach, others ask me to cut corners in quality, process, or features to make a ship date.

With that potential to cause harm, do deadlines fit in with agile processes? Are deadlines agile?

There's no guide to dictate whether something is or is not agile. There is a good set of principles, however, laid out in the Agile Manifesto.

Individuals and interactions over processes and tools

By definition deadlines are part of a process: they are the end of it. They are also a tool that is often misused. The Agile Manifesto lists these points as a core set of values, and the value here is that the process should fit the individuals, not that the individuals should conform to the process.

Agile or Not Agile: Not Agile

Working software over comprehensive documentation

As painful as it is to admit, software that ships often does work. My grader in data structures didn't care that I threw out the class material; I was graded on the fact that my assignment gave the right outputs for the given inputs. Customers often have the same singular concern.

Agile or Not Agile: Agile

Customer collaboration over contract negotiation

Customers can’t use software that hasn’t shipped. A properly stated deadline can also help the development team and the customer have an informed conversation and, for example, decide together if that last feature is really needed now or if it can wait until the next release.

Agile or Not Agile: Agile

Responding to change over following a plan

Having a date that will not change is not responsive. Moving to kanban from scrum taught me that arbitrary time boxes don’t respond to a changing environment either. Since we can’t predict the future our plans can’t account for everything that inevitably happens.

Agile or Not Agile: Not Agile

Final score: 2 to 2

A tie.

Darn.

Pulsar Screensaver

I miss screensavers. They're not really necessary with modern displays, but I still install XScreensaver on new computers. When I was looking for projects to practice HTML5 and CSS3 features on, I decided to try implementing some of my favorites. For now I'm limiting myself to plain old JavaScript (no frameworks) to make things more interesting.

I decided to start with the pulsar screensaver. It's six gradient planes that rotate around different axes.

Build One Object

Filling a square with a gradient is pretty straightforward: just choose six values between 0 and 255 for the red, green, and blue components, start at the top left of the square, and go to the bottom right.

function randomGradient() {
   var demo1 = document.getElementById("gradient1");
   var colors = new Array();
   for (var i = 0; i < 6; i++) {
      colors[i] = Math.floor(Math.random() * 256);
   }
   var style = "background-image: linear-gradient(45deg, rgb(" + colors[0] + "," + colors[1] + "," + colors[2] + ") 0%, rgb(" + colors[3] + "," + colors[4] + "," + colors[5] + ") 100%);";
   demo1.style.cssText = style;
}

The results are:



That's not what I was expecting.

I reloaded the page a few times but the random gradients looked flat and were not nearly as bold as the original. Then I noticed that the original had a color stop in the middle. That was easy enough to add: I just had to choose three more colors and stop halfway through the image.

function randomGradientWithStop() {
   var demo2 = document.getElementById("gradient2");
   var colors = new Array();
   for (var i=0; i < 9; i++) {
      colors[i] = Math.floor( Math.random() * 256 );
   }
   var style = "background-image: linear-gradient(45deg, rgb(" + colors[0] +"," + colors[1] + "," + colors[2] + ") 0%, rgb(" + colors[3] +"," + colors[4] + "," + colors[5] + ") 50%, rgb(" + colors[6] +"," + colors[7] + "," + colors[8] + ") 100%);";
   demo2.style.cssText = style;
}


That wasn't much better. Taking a closer look, I realized that all of the gradients in the original pulsar were the same. That would help clean up the JavaScript. One quick trip to the Ultimate CSS Gradient Generator and I had the CSS I needed.

.gradient {
   background: #ff0c0c; /* Old browsers */
   background: -moz-linear-gradient(-45deg,  #ff0c0c 26%, #30ff3e 50%, #26f23e 51%, #1e2dff 79%); /* FF3.6+ */
   background: -webkit-gradient(linear, left top, right bottom, color-stop(26%,#ff0c0c), color-stop(50%,#30ff3e), color-stop(51%,#26f23e), color-stop(79%,#1e2dff)); /* Chrome,Safari4+ */
   background: -webkit-linear-gradient(-45deg,  #ff0c0c 26%,#30ff3e 50%,#26f23e 51%,#1e2dff 79%); /* Chrome10+,Safari5.1+ */
   background: -o-linear-gradient(-45deg,  #ff0c0c 26%,#30ff3e 50%,#26f23e 51%,#1e2dff 79%); /* Opera 11.10+ */
   background: -ms-linear-gradient(-45deg,  #ff0c0c 26%,#30ff3e 50%,#26f23e 51%,#1e2dff 79%); /* IE10+ */
   background: linear-gradient(135deg,  #ff0c0c 26%,#30ff3e 50%,#26f23e 51%,#1e2dff 79%); /* W3C */
   filter: progid:DXImageTransform.Microsoft.gradient( startColorstr='#ff0c0c', endColorstr='#1e2dff',GradientType=1 ); /* IE6-9 fallback on horizontal gradient */
}

That was not the first or last time I would make something more complicated than it needed to be.

Do Something

Rotation is available around any axis using a CSS 3D transform. The functions rotateX, rotateY, and rotateZ are values of the transform and -webkit-transform CSS properties and can rotate an HTML element around the X, Y, or Z axis. The rotation functions each take values in degrees and can go beyond 360 degrees, something that will come in handy later.

The rotation can be animated by applying a CSS transition to the transform. To make things simple, the transition property can be applied to all CSS properties.

The final CSS:

   transition: all 4s;
   -webkit-transition: all 4s;

The final JavaScript:

var rotateDemoX = 0;
function rotateX(id) {
   var rotate = document.getElementById(id);
   rotateDemoX = rotateDemoX + 180;
   rotate.style.cssText += ";transform: rotateX(" + rotateDemoX + "deg);-webkit-transform: rotateX(" + rotateDemoX  + "deg);";
}


Repeat

At first I dynamically created six divs to use as my rotating planes, then I realized that I was making things more complicated than I needed to, again. I ended up writing the planes in HTML and assigning a negative margin-top to each so they'd end up on top of each other.

   <div id="p0" class="gradient"></div>
   <div id="p1" class="gradient" style="margin-top: -70%"></div>
   <div id="p2" class="gradient" style="margin-top: -70%"></div>
   <div id="p3" class="gradient" style="margin-top: -70%"></div>
   <div id="p4" class="gradient" style="margin-top: -70%"></div>
   <div id="p5" class="gradient" style="margin-top: -70%"></div>

The rotateX, rotateY, and rotateZ JavaScript functions above don't work well when applied simultaneously; the cssText property gets overwritten. That's easy enough to fix though with the rotateAll function below.

Note: the xCol, yCol and zCol are just constants to increase readability.

function rotateAll(p) {
   var rotate = document.getElementById('p' + p);
   rotate.style.cssText += ";transform: rotateX(" + planes[p][xCol] + "deg)" +
                           " rotateY(" + planes[p][yCol] + "deg)" +
                           " rotateZ(" + planes[p][zCol] + "deg);" +
                     "-webkit-transform: rotateX(" + planes[p][xCol] + "deg)" +
                           " rotateY(" + planes[p][yCol] + "deg)" +
                           " rotateZ(" + planes[p][zCol] + "deg);";
}

Defining a random rotation then is fairly straightforward.

function rotatePlane(p) {
   planes[p][xCol] += Math.floor(Math.random() * 720);
   planes[p][yCol] += Math.floor(Math.random() * 720);
   planes[p][zCol] += Math.floor(Math.random() * 720);
   rotateAll(p);
   setTimeout(function() { rotatePlane(p); }, 4000);
}

The last line allows for a continuous rotation. Since the transitions have a fixed duration, that is all the delay that is needed before the function recurses.

A setup function called on page load completes the screensaver.

function setupDemo3() {
   planes = new Array();
   for (i = 0; i < numPlanes; i++) {
      planes[i] = new Array();
      planes[i][xCol] = 0;
      planes[i][yCol] = 0;
      planes[i][zCol] = 0;
      rotatePlane(i);
   }
}

What Did I Learn?

  • Random gradients don't work.
  • Between writing the demo and publishing it, Chrome updated to use a newer version of the CSS 3 standard. Originally the rotation functions could be applied in series; that is, I could rotateX(10deg) and then rotateX(10deg) and have the result be a 20 degree rotation. That doesn't work anymore. The fix was to keep a running total of the rotation along each axis. That kept the planes spinning.
  • There was some fighting to get the code into WordPress. That was resolved with some plugin hunting and should be transferable to the next post.

Blog Launched

Hello.  I’m glad you stopped by.

The blog is live, but there’s still a lot of work to be done.  I’m trying to be agile and I figured it was better to launch a site that worked and iteratively improve it— hopefully quickly.  So you may notice semi-frequent changes to the graphic design while I release my backlog of posts— and that backlog is growing as site updates inspire more to write about.

Given the categories I’m starting with (administrative, soft skills, HTML/CSS/Javascript, and cats) I’m hoping that there’s something here for everyone.

No Surprises

I have a rule that I apply to my cats and to my coworkers.* It's one of those rare circumstances when over-generalization actually works.

One of my cats gets medicine every day, and she likes it about as much as you’d expect a cat to— not at all. Every day we go through the same routine: I measure the dose, she runs, I catch her, I take her to the medicine, I give her the dose, she runs again, and we’re done.  After all of that she’ll be waiting for me on the couch, completely unafraid.  She always knows when the bad stuff will happen.

It took me a while to realize how that reasoning could be agile.

“Do you think you surprised them?”  I don’t remember what prompted my manager to ask that question.  I might have been talking about a problem I was having a hard time solving.  I might have relayed some tidbit I heard in a meeting.  I might have mentioned some test case that broke a new feature.

Whatever prompted the question, I did surprise my manager. Here I had someone on my team whose primary job it was to reduce risk, and I surprised them with bad news. I didn't have to do that.

In fact, most teams have someone like this.  They usually have the title of manager.  They are the ones who are expected to know when things will ship, what’s keeping the team from shipping, and what happens if the team doesn’t ship.  They definitely want to know what might be a problem and how likely those problems are.

The question stayed with me. I had a simple rule about what I should say in status meetings: if I don't say something now, will I surprise my manager later? I didn't want to inundate my manager with trivial details, but trivial information wouldn't be surprising. Thus the rule has a built-in filter.

It was a few years later that a different manager asked me, “I know you don’t want to surprise your project manager.  What about the rest of your team?”

I was a little sad that it hadn't occurred to me sooner. It's not solely the job of people with manager in their title to mitigate risk. It's part of everyone's job on the team, including mine. Again, I don't want to go on about problems that aren't likely, but trivial information isn't surprising. It's a different problem if I can't figure out what my team thinks is trivial.

So now I try not to surprise anyone, feline or human.

* What can I say: cats get page views.