Overview
Listening to the same songs over and over again can get boring. I wanted to be able to tweak the sounds of my favourite songs to keep them feeling fresh. I find that something as simple as a pitch shift or tempo change can make a song feel new again, and so I set out to build a Chrome extension to do just that. Importantly, I aimed to build this extension quickly, as I decided that spending too much time learning the intricacies of audio processing would not be the best use of my time.
Tech Stack
- HTML: For the structure of the extension
- CSS: For the styling of the extension
- JavaScript: For the functionality of the extension
Features
- Sliders to change the pitch and tempo of the video
- A full suite of EQ settings
- A quick reset button to reset the sound to the original values
First Steps
I began by building a simple interface with sliders, to which I would later map the audio manipulation functions. Within the first couple of hours, after researching live audio manipulation, I implemented a bass boost from scratch, using the Web Audio API to manipulate the audio stream in real time. Problems arose when I tried to implement pitch shifting. The Web Audio API has no built-in node for shifting pitch independently of tempo, so I had to find a different approach.
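To illustrate the roadblock: the simplest built-in way to change pitch is to change the playback rate, but that changes tempo by the same factor. The sketch below shows the relationship, assuming equal-tempered semitones. The function names are hypothetical helpers, not part of any API or of the extension's actual code.

```javascript
// Hypothetical helper: convert a pitch shift in semitones to a playback-rate ratio.
// Each semitone multiplies frequency by 2^(1/12), so 12 semitones doubles it.
function semitonesToRatio(semitones) {
  return Math.pow(2, semitones / 12);
}

// Playing audio at this ratio raises the pitch by the requested amount,
// but it also divides the duration by the same factor.
function durationAfterResampling(originalSeconds, semitones) {
  return originalSeconds / semitonesToRatio(semitones);
}
```

For example, a three-minute song shifted up an octave this way would last only 90 seconds, which is why independent pitch and tempo control needs a time-stretching algorithm on top of simple resampling.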
As a developer unfamiliar with audio processing, I decided that using a library would be the best way past my roadblock with the Web Audio API. Choosing one was still difficult: documentation was limited, and my inexperience made it hard to understand the nuances of each library, which would likely cause debugging problems down the line.
I decided to search the Chrome Web Store for extensions that offered live audio manipulation. I quickly discovered Transpose, which proved to be a great starting point. Reading its code helped me understand how live audio manipulation works and how I might implement pitch shifting myself.
Challenges
The codebase for Transpose was significantly larger than I had anticipated, containing well over 20,000 lines of code. I discovered that Transpose makes use of SoundTouch, a library for real-time audio processing. Knowing that SoundTouch had been used successfully for a use case identical to my own, I decided it would be a safe bet for my extension. Before implementing it, I dove further into the Transpose codebase to understand live audio manipulation, and pitch shifting in particular. To my surprise, I found that despite its many references to the SoundTouch library, Transpose still contained thousands of lines of complex custom code to handle pitch shifting.
Confused, I continued to read through the codebase in an attempt to determine whether this sea of complex code was actually necessary for pitch shifting. Aha! I found that the codebase made use of a custom audio worklet, which was necessary to avoid the issues I had experienced earlier in development with the Web Audio API. Further investigation revealed that my original idea of building the extension from scratch would not be feasible.
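For readers unfamiliar with audio worklets, the skeleton below shows the general shape of one. It is a minimal sketch, not Transpose's code: it only copies input to output with a gain, whereas a real pitch shifter would run its processing inside the same `process` callback. The names `processBlock` and `pass-through` are assumptions made for illustration.

```javascript
// Copy each input channel to the matching output channel, applying a gain.
// Web Audio hands the worklet small blocks of samples as Float32Arrays.
function processBlock(inputs, outputs, gain) {
  const input = inputs[0];
  const output = outputs[0];
  for (let ch = 0; ch < output.length; ch++) {
    const inChannel = input && input[ch];
    const outChannel = output[ch];
    for (let i = 0; i < outChannel.length; i++) {
      // Write silence when no input is connected.
      outChannel[i] = inChannel ? inChannel[i] * gain : 0;
    }
  }
  return true; // keep the processor alive
}

// Register the processor only inside an AudioWorkletGlobalScope,
// where AudioWorkletProcessor and registerProcessor are defined.
if (typeof AudioWorkletProcessor !== 'undefined') {
  class PassThroughProcessor extends AudioWorkletProcessor {
    process(inputs, outputs) {
      return processBlock(inputs, outputs, 1.0);
    }
  }
  registerProcessor('pass-through', PassThroughProcessor);
}
```

On the main thread, a file like this would be loaded with `audioContext.audioWorklet.addModule(url)` and then instantiated with `new AudioWorkletNode(audioContext, 'pass-through')`.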
Solution
I searched the Chrome Web Store once more in an attempt to find alternatives to Transpose. I wasn’t too keen to take the same approach as Transpose, which involved over 20,000 lines of code, so I was eager to find an alternative with a simpler codebase that I could feasibly understand and implement.
Unfortunately, I was unable to find an alternative extension which offered live audio manipulation including pitch shifting as gracefully as Transpose. Alternatives had issues with audio quality, latency, and media stream buffering. Dismantling and recycling individual functions from Transpose was impractical due to high coupling, and the codebase was too complex to understand and implement from scratch.
It was at this point that I decided to fork Transpose and replace some of its features with my own.
Enhancements
Forking Transpose allowed me to bypass the complexity of writing a complete audio engine from scratch, but I was eager to add some of my own features to the extension in line with my original goals.
The first feature I added was bass boost. Having already built one from scratch earlier in the project, I was able to reimplement it with ease, using a simple biquad filter node to boost the bass frequencies of the audio stream.
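A bass boost of this kind can be sketched in a few lines. The version below assumes a low-shelf filter with a 200 Hz cutoff; both choices, and the function name `createBassBoost`, are assumptions for illustration rather than the extension's actual settings.

```javascript
// Sketch: build a low-shelf filter that boosts frequencies below the cutoff.
// `ctx` is an AudioContext; the 200 Hz default cutoff is an assumed value.
function createBassBoost(ctx, gainDb, cutoffHz = 200) {
  const filter = ctx.createBiquadFilter();
  filter.type = 'lowshelf';
  filter.frequency.value = cutoffHz; // shelf cutoff in Hz
  filter.gain.value = gainDb;        // boost in dB (negative values cut)
  return filter;
}

// Browser usage (not run here):
//   const ctx = new AudioContext();
//   const source = ctx.createMediaElementSource(document.querySelector('video'));
//   source.connect(createBassBoost(ctx, 6)).connect(ctx.destination);
```

Because `connect` returns the node it connects to, the filter can be chained between the source and the destination in a single expression.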
Next came reverb. Transpose didn’t implement spatial effects, and I wanted to simulate room ambience for a more immersive sound. I generated an impulse response buffer dynamically and routed the processed audio through a ConvolverNode. This allowed me to add a reverb effect to the audio stream. Getting this feature right was important to me, as a friend of mine enjoys “slowed+reverb” mixes of his favourite songs, so I wanted this extension to work perfectly for him.
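One common way to generate an impulse response dynamically is to shape white noise with a decaying envelope. The sketch below takes that approach. It is an assumed technique for illustration, and the function names, the two-second length and the decay exponent are hypothetical rather than taken from the extension.

```javascript
// Fill one channel with white noise under a decaying envelope.
// `decay` controls how quickly the tail fades (higher means faster).
function fillImpulseResponse(channel, decay) {
  const length = channel.length;
  for (let i = 0; i < length; i++) {
    const envelope = Math.pow(1 - i / length, decay);
    channel[i] = (Math.random() * 2 - 1) * envelope;
  }
  return channel;
}

// Build a ConvolverNode from a generated stereo impulse response.
// `ctx` is an AudioContext; the default length and decay are assumed values.
function createReverb(ctx, seconds = 2, decay = 3) {
  const length = Math.floor(ctx.sampleRate * seconds);
  const buffer = ctx.createBuffer(2, length, ctx.sampleRate);
  for (let ch = 0; ch < buffer.numberOfChannels; ch++) {
    fillImpulseResponse(buffer.getChannelData(ch), decay);
  }
  const convolver = ctx.createConvolver();
  convolver.buffer = buffer;
  return convolver;
}
```

Filling each channel with independent noise gives the left and right tails slightly different content, which tends to make the reverb sound wider than a mono impulse would.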
I then added delay, which proved more difficult due to synchronisation concerns. I implemented a DelayNode alongside a feedback loop using a GainNode, carefully managing the gain to avoid clipping or infinite feedback. The result was a controllable delay effect where users could manipulate both delay time and feedback levels in real time.
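The delay-plus-feedback wiring can be sketched as follows. This is a minimal illustration under assumptions: the 0.95 feedback ceiling, the five-second maximum delay and the function names are hypothetical and not the extension's actual values.

```javascript
// Keep feedback in [0, max] with max below 1, so each repeat is quieter than the last.
function clampFeedback(value, max = 0.95) {
  return Math.min(Math.max(value, 0), max);
}

// Sketch of a feedback delay: the delay output is fed back into
// its own input through a gain node, producing decaying echoes.
function createFeedbackDelay(ctx, delaySeconds, feedback) {
  const delay = ctx.createDelay(5); // maximum delay time in seconds (assumed)
  const feedbackGain = ctx.createGain();
  delay.delayTime.value = delaySeconds;
  feedbackGain.gain.value = clampFeedback(feedback);
  delay.connect(feedbackGain);
  feedbackGain.connect(delay); // loops are allowed when they contain a DelayNode
  return { delay, feedbackGain };
}
```

In a full graph, the source would connect both directly to the destination (the dry signal) and to `delay`, with `delay` also connected to the destination (the wet signal). Both `delayTime` and `gain` are audio parameters, so sliders can update them while audio is playing.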
Lastly, I implemented quick presets. Being able to quickly set a song to a specific set of effects was an obvious feature to add. Now, my friend could quickly transform any song into a “slowed+reverb” mix at the click of a button!
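A preset can be as simple as a named bundle of slider values. The sketch below shows one possible shape. The preset names, setting keys and numbers are all hypothetical and chosen only for illustration.

```javascript
// Hypothetical preset table; names and values are illustrative only.
const PRESETS = {
  original: { tempo: 1.0, pitchSemitones: 0, reverbMix: 0.0 },
  slowedReverb: { tempo: 0.85, pitchSemitones: 0, reverbMix: 0.4 },
};

// Return a new settings object with the preset applied on top of the current one.
// Settings the preset does not mention are left unchanged.
function applyPreset(current, name) {
  const preset = PRESETS[name];
  if (!preset) {
    throw new Error(`Unknown preset: ${name}`);
  }
  return { ...current, ...preset };
}
```

Returning a new object rather than mutating the current settings keeps the reset button simple: applying the `original` preset restores the default values in one step.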
Reflections
This project was a great learning experience for me. I learned a lot about audio processing and the Web Audio API, and I was able to build a Chrome extension that I’m proud to share with others. I did encounter many challenges along the way, but I didn’t let them stop me from achieving my goal. I certainly would have preferred to have built this extension from scratch, but I’m more than happy with the result I was able to achieve in the time I had allocated to this project.
The codebase of Transpose was a double-edged sword: on one hand, I learned a lot from it; on the other, it was clearly not written with maintainability in mind. I found it difficult to understand, not modular, and not in line with best practices. This served as a reminder to me of the importance of writing maintainable code.