Visual+Gestural musical interface

The strength of this project lies in the collaboration. I knew I wanted to make some form of visual musical interface or instrument for non-musicians, but without any understanding of music theory myself, I needed to partner with someone who had it. I met two students who had similar ideas around a glove-based gestural interface. We got together and spent a week ideating on possible interfaces with the given controls.

As the primary target user, with no formal knowledge of music, I had to think of suitable gestural and visual metaphors to model the interface on. I wanted to strike a balance between enough structure to constrain the musical output, so that the music doesn't sound random and incoherent, and a freestyle component, so that the user also has a sense of spontaneity in playing melodies. What emerged was a holistic musical interface that combines non-traditional graphic notation with gesture-based wearable controls, simple and intuitive enough for a non-musician to use comfortably. It sits halfway between a traditional step sequencer and Kandinsky from the Chrome Music Lab experiments: the step sequencer is where you loop the different instruments, while the drawing program and glove are used in combination to freestyle over the music.

We decided to use the Tone.js library as the primary engine for all the sound synthesis. The most challenging yet exciting part of this process was the convergence and divergence of our individual work processes within the team. It was a bit of a catch-22: the visuals and animations were dependent on the sound, and the sound was dependent on the visuals. So each of us would go off exploring in our own way and then build on each other's work, step by step.

As an early POC of the idea, I put together a basic visual composition to compose a piece of music around. The idea was to prototype the translation of visuals into sound.

Visual design 1

Each color is a separate channel playing a different sound, with the shapes dictating the texture and timbre of the sound. The position of a shape on the canvas dictates its frequency and pitch, and its opacity sets the volume. Here's what it sounds like –

Next, I made a POC of a basic drawing program, with two different oscillators mapped to the x and y axes.
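A minimal sketch of that mapping, assuming p5-style mouse coordinates and Tone.js oscillators (the helper name `mapAxisToFreq` and the frequency range are my assumptions, not the original code):

```javascript
// Linearly map a coordinate in [0, size] to a frequency range in Hz.
// This pure helper is a hypothetical reconstruction of the mapping.
function mapAxisToFreq(pos, size, minHz, maxHz) {
  const t = Math.min(Math.max(pos / size, 0), 1); // normalize and clamp to [0, 1]
  return minHz + t * (maxHz - minHz);
}

// In the browser sketch (p5 + Tone.js), it might be wired up like this:
// const oscX = new Tone.Oscillator(440, "sine").toDestination().start();
// const oscY = new Tone.Oscillator(440, "square").toDestination().start();
// function mouseDragged() {
//   oscX.frequency.value = mapAxisToFreq(mouseX, width, 100, 1000);
//   oscY.frequency.value = mapAxisToFreq(mouseY, height, 100, 1000);
// }
```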

For the glove, we settled on 4 flex sensors (read by an Adafruit Flora board) sewn onto the glove for 4 fingers, and either a light sensor or an accelerometer at the palm to modulate a certain sound parameter. We initiated the process of sourcing the BOM for the glove.

POC of interface + Serial control
This one uses a potentiometer in place of flex sensors.

By this time the team was making a lot of progress with understanding and creating sounds with Tone.js. To familiarize myself with the library and test serial interaction with the sketch, I put together a basic generative music program with sliders that controlled the tempo via the frame rate. As the flex sensor was giving me erratic values in p5, I mapped the slider values to a potentiometer instead (to simulate the flex sensors on the glove).

I wanted to create different tempo rates for different sound families, each controlled by a pot. Changing the tempo seemed like an easy, quick way to add a rich level of expression to the music being created.
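The pot-to-tempo mapping can be sketched as follows. The 60–180 BPM range and the helper name are my assumptions; the original prototype actually drove the frame rate, which the team later moved away from:

```javascript
// Map a raw 10-bit potentiometer reading (0..1023) to a tempo in BPM.
// The BPM range is an assumed example, not the project's actual values.
function potToTempo(raw, minBpm = 60, maxBpm = 180) {
  const t = Math.min(Math.max(raw / 1023, 0), 1); // normalize and clamp
  return minBpm + t * (maxBpm - minBpm);
}

// With Tone.js, the serial value could drive the transport directly
// instead of the sketch's frame rate:
// Tone.Transport.bpm.value = potToTempo(serialValue);
```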

The next step was to find suitable visual metaphors to give body to the sound, and to choose the right animations to represent the sounds intuitively.

Interface idea

This was an early visualization of the interface: three basic geometric shapes and a draw program on the left, each shape producing a certain sound. The glove is used to modulate the sounds and add effects, which then dynamically update the 10Print generative program on the right. So there is a rule system the computer decides for you, but you also have a vast degree of freedom to modulate the sounds in real time.
But of course, as my team rightly pointed out, the tempo couldn't be dependent on the frame rate; we needed the maximum frame rate for the animations to run smoothly.
So we decided to add one more instrument to the grid on the left and move the draw program to the right.

Max put together this compiled sketch with placeholder sounds and animations to build on.

I started figuring out the different sound analyzers in the Tone.js library, to use their data to animate each shape.

Circle / Kick
Currently I am using the Tone.js progress value to track the kick loop as it progresses through time. The returned values are mapped to the circle in reverse, so it swells to its peak as it hits the kick sound.
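A sketch of that mapping, assuming a Tone.Loop whose read-only `progress` property runs from 0 to 1 over each loop cycle; the radius range and helper name are illustrative, not the project's actual values:

```javascript
// Map loop progress (0..1) to a circle radius so the circle is at its
// peak just as the loop wraps around and the kick hits again.
function kickRadius(progress, minR, maxR) {
  return minR + progress * (maxR - minR);
}

// In the draw loop (browser only), something like:
// const kick = new Tone.MembraneSynth().toDestination();
// const kickLoop = new Tone.Loop(
//   time => kick.triggerAttackRelease("C1", "8n", time), "1m"
// ).start(0);
// function draw() {
//   const r = kickRadius(kickLoop.progress, 20, 120);
//   ellipse(width / 2, height / 2, r * 2);
// }
```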

Square / Bass
I am using a waveform analyzer on the bass. The waveform analyzer returns values as a Float32Array. It took me nearly half a day to get this working. At first I tried mapping the array values to a single line, but instead of getting a granular visualization of the waveform, I was only able to move a few points on the line, leading to much frustration! I ran the bass sound through a distortion filter as a naive attempt at getting a more pronounced waveform. Eventually I used the array values to make the square vibrate to its waveform. I played around with the array size and skipped a few samples until I achieved a satisfactory amount of vibration.
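The sample-skipping trick can be sketched like this. Tone.Waveform's `getValue()` does return a Float32Array, but the helper and the skip/amplitude numbers here are my assumptions:

```javascript
// Take every `step`-th sample from a waveform buffer and scale it into
// a pixel offset, used to jitter the square's edge vertices.
// `step` and `amp` are tuning knobs; the values shown are illustrative.
function vibrationOffsets(values, step, amp) {
  const offsets = [];
  for (let i = 0; i < values.length; i += step) {
    offsets.push(values[i] * amp);
  }
  return offsets;
}

// In the sketch (browser only):
// const wave = new Tone.Waveform(256);
// bassSynth.connect(wave);
// const offsets = vibrationOffsets(wave.getValue(), 8, 15);
// // ...then draw each edge vertex displaced by the matching offset
```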

Dynamic dots / Piano
The piano chords are very rich in sound, and the animations had to do justice to them. The piano loop has 4 sets of chords, each with 3 notes. I wanted a circle to fade in each time a chord was played from any given loop. This was very challenging to achieve because the sound/time component lived inside the Tone.js library. Finally, with help from Shivanku, I was able to sync the animation with the chords. I used the progress value as a global timer and broke it down into three phases, then initiated a different circle animation at each phase.
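Breaking the progress value into phases can be sketched as follows; the helper name is mine, and the phase count is a parameter since the post mentions both four chords and three phases:

```javascript
// Split a loop's progress (0..1) into `numPhases` equal phases and
// return the index of the current phase, clamped so progress === 1
// still maps to the last phase rather than overflowing.
function chordPhase(progress, numPhases) {
  return Math.min(Math.floor(progress * numPhases), numPhases - 1);
}

// In the draw loop, each phase would trigger its own circle animation:
// const phase = chordPhase(pianoLoop.progress, 3);
// circles[phase].fadeIn(); // hypothetical animation call
```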

Paint / lead
The paint program plays different notes as you paint across the screen; the notes are mapped to different zones on the canvas.
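The zone mapping can be sketched like this; the note names and the four-zone split are illustrative assumptions, not the sketch's actual scale:

```javascript
// Divide the canvas width into equal vertical zones, one per note, and
// return the note for a given x position. The note list is an example.
function zoneToNote(x, canvasWidth, notes) {
  const zone = Math.floor((x / canvasWidth) * notes.length);
  return notes[Math.min(Math.max(zone, 0), notes.length - 1)];
}

// In the paint program (browser only):
// const lead = new Tone.Synth().toDestination();
// function mouseDragged() {
//   const note = zoneToNote(mouseX, width, ["C4", "E4", "G4", "B4"]);
//   lead.triggerAttackRelease(note, "16n");
// }
```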

This is the revised sketch; the triangle with the snare sound is still being figured out.
new sketch –

In parallel, the glove was also shaping up steadily. It uses a simple circuit of 4 flex sensors, each wired to a pin of the Flora board.

Breadboard POC of the flex sensor readings.
test 1

Update: Amena finished sewing everything onto the glove. The only thing left is soldering all the connections.
glove 0

Final glove interfacing with the sketch.
glove 1

In the meantime I continued improving the animations. For the snare / triangle I used the waveform again, to slightly shake the triangle as the snare hits.

This is the final sketch. After testing on an iPad, the black background was showing too many smudge marks on the screen, so I switched to a white background. The latest sketch is also fully responsive and works on any screen.
sketch –


Project proposal for final project_Fall ’17

I want to make an intuitive electronic music production system for non-musicians.

I’ve always wanted to make electronic music but haven’t been able to do so successfully, mainly because I am incapable of auditory thinking and all the serious music production software is quite counterintuitive, with lots of knobs, buttons and sliders as inputs for control.

I want to create a system which is directed towards people who are more of visual or spatial thinkers than auditory thinkers. It is easier for me to ‘see’ music rather than ‘hear’ it in my head. The system should feel very natural and intuitive and involve multiple senses.

I will be collaborating with Max Horwich and Amena Hayat on this project. Max is deeply immersed in the technical aspect of music and sound and Amena is interested in using soft wearables as control.

So far we have a lot of scattered ideas and are still in the brainstorming phase, but what seems to be emerging is a combination of a tabletop interface and a wearable control. Perhaps the tabletop interface is where you visually design complex rhythm structures, while the wearable handles the fluid instrumentation and melodic composition.

Phys Comp Midterm

I collaborated with Jiyao Zhang on my midterm project for this class.

Since the midterm presentation was due around Halloween, we decided to do something around that theme.

After several rounds of discussion, we finally arrived at a simple idea: releasing a ghost trapped in a glass jar, which tells you your fortune… or rather your misfortune.

We started with initial project plans and early sketches of the ghost!

We wanted the ghost-releasing interface to be very intuitive and to directly reference the real world. The childhood activity of trapping fireflies in a jar seemed like the perfect metaphor for such an interface.

The jar in which the ghost was to be captured

We decided to put an illuminated ping-pong ball at the bottom of the jar to symbolize the ghost. The act of opening the lid was to act as a switch, triggering the ghost-release event in p5.
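The lid-as-switch logic boils down to edge detection on the serial reading. A sketch, with the helper name and the open/closed convention as my assumptions:

```javascript
// Fire the ghost-release event only on the transition from lid-closed
// to lid-open, not on every frame the lid stays open.
// Convention assumed here: `closed` is true while the lid completes the circuit.
function lidReleased(prevClosed, nowClosed) {
  return prevClosed && !nowClosed;
}

// In the p5 sketch, reading a 0/1 switch state from serial each frame:
// let wasClosed = true;
// function serialEvent() {
//   const isClosed = serial.read() === 1;
//   if (lidReleased(wasClosed, isClosed)) releaseGhost(); // hypothetical handler
//   wasClosed = isClosed;
// }
```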



New home for the ghost

Unfortunately the plastic jar was lost just 12 hours before the presentation. We were forced to rethink the interface, and the third iteration actually turned out much better.

We went back to our glass jar and decided to use it inverted without the lid, as if one was trapping an insect crawling on the floor.



We created a switch using copper tape electrodes on the platform. The jar also has copper foil on its rim, which completes the circuit when the jar is placed on the platform.


The image above shows the jar placed inverted on a platform, with an illuminated ball inside to symbolize the ghost. When the jar is lifted off the platform, the illuminated ball turns off, as if the ghost has been released, and the ghost appears on a screen and tells you your misfortune.

Click on the image below to watch the video of the project

Click for the p5 code

PC Week 3_Interaction design systems in public

Walmart self checkout

What do you think is the biggest pain point of Walmart’s self-checkout system? ……
Correct: theft! It’s relatively easy to scam its simple system of correlating barcode scans with item weights. There are tons of articles online offering a bouquet of strategies for scamming the self-checkout system.

I went to Walmart yesterday to experience it firsthand. There are several loopholes in the system, and the overall UX feels very scattered. Each item took several minutes to scan and show up in the shopping cart, which caused a lot of confusion, especially while scanning multiple items at once. There is always the fear that you are scanning the same item twice and paying double; the whole interface needs to be organized and prioritized. Somehow our machine did not have audio assistance, which made things worse. Some items did not even have a barcode, and it’s no fun to spend 5 minutes figuring out an item number and bagging it, only for the machine to respond with: “Unexpected item in bagging area. Remove item.”

The whole self checkout system UX sure had me thinking of ways to scam it in revenge!


If I were to reimagine the fantasy device as a self-checkout system, I would envision a simple system with minimal UI: a large, weight-sensitive table, perhaps like the Microsoft Surface table, on which you lay out all your items. A camera on top scans barcodes and product images, correlates them with the measured weights, and sends the items for bagging automatically.



As I approached this question, I wondered if I was walking into analysis paralysis by overthinking something so simple to answer. Isn’t interaction simply communication between two people…? So I asserted, and found it pretty much futile to keep questioning, but the train of thought had already begun.
…Or is it interaction when it takes place between a person and an object, like a boxer practicing punches on a boxing bag? Or between two objects, like our heart and lungs working in coordination to keep the blood flowing and oxygenated…?
When a Mimosa pudica herb closes its leaves in response to stimuli, is it interactive or merely reactive?
It may seem like new media creatives (myself included) would be quick to conclude ‘interactive’. After all, if you were to artificially recreate a Mimosa herb using mechatronics, you would know what a long-winded process it would be: setting up the sensors, motors and linkages, and programming all of the components to talk to each other to produce the desired result, is a commendable effort. Moreover, there is a plethora of new media projects (with an interactivity quotient similar to that of a Mimosa herb) which respond to stimuli rather one-dimensionally, yet are marketed as ‘interactive art’, widely exhibited at ‘interactive media art festivals’ and frequented by ‘interaction designers and artists’.

So clearly there is a lot of buzz around interactive tech and media, but without a clear understanding of what it really means. In The Art of Interactive Design, Chris Crawford defines interaction as a conversation between two actors who “listen, think and speak.” Physical interaction involves give and take; it is not simply a reaction (for example, a viewer watching a movie) or participation (a couple dancing).

So Crawford would clearly dismiss the notion of a Mimosa herb being interactive and I agree with him.

I think it’s OK to declare that in the context of interactive design or interactive media, for something to be interactive, it has to work outside of repetitive instances of mere cause and effect, unless the effect stirs up another cause leading to a different effect. This act of give and take is much more dynamic, engaging and meaningful.

But I also wonder why Crawford chose the term ‘actors’ instead of ‘people’. Could it imply that the two entities are acting out roles on a given stage (context) where the interaction plays out? A context with its own rules, behaviors and sign systems? Is an AI machine talking to another AI machine a case of interaction design? Even though a human is only an audience to such an AI-driven chat show, it still takes a human to design and engineer it. How, then, would a person design such a platform for AI conversation? Should you design it to be decipherable by humans? What if you instead invent an entirely new language, far richer in semiotics in computer parlance, for AI bots to communicate with several times more efficiently? Does that still hold up as interaction design?

As you can see, the more you attempt to define it, the more obscure or inadequate the definition starts to feel. It can perhaps only be understood within a certain framework (context) of declarations and objectives.



There isn’t really, because there is perhaps no satisfactory answer. But in the process of trying to define it you come to understand the boundaries of the concept, and a boundary can be thought of as both limiting and expansive. Moreover, you gain some objective parameters against which to judge your design. Most importantly, the process of inquiry gives you better-informed visions of the future of interactivity!

Bret Victor, in A Brief Rant on the Future of Interaction Design, says, “I believe that our hands are the future!” It’s quite a fun way of saying that interaction design, as we know it, is under-utilizing the capabilities of the human body. Our dexterous hands are capable of far more complex and nuanced gestures than simply swiping and tapping information on a sheet of glass. Bret has a term for it: ‘pictures under glass’.

At its most fundamental level, you can think of interactivity as a long string of information exchange. If you observe the different ways in which nature presents information, you can imagine a much richer future of interaction design that extends far beyond the screen, incorporating sights, sounds, smells, taste, touch and everything else that appeals to our senses!

This is indeed a beautiful idea, and it has been explored before. I appreciate Bret’s essay for putting forth a case for a more organic and immersive interface design that fully exploits and extends human capabilities, but at the same time it is frustrating to find no answers for achieving that level of interactivity. I appreciate the detailed, well-documented presentation on the dexterity of the human hand, but all of those gestures are very one-dimensional and meant for specific actions.

Let me explain. I think every rosy idea should be tested in practical terms as an early prototype, even if only as a simple thought experiment. So I wondered: if I were to design enterprise software, in which the user is presented with several layers of information and has to navigate several pages with multiple buttons to click and drop-down menus to select, how much of our sensory or bio-mechanical capability should I exploit to control the interface? Should I design the software to accept as input the full spectrum of hand gestures, from grips to pinches to scissors to flicks? Or should you smell lavender every time you get an email? Or should I engineer my laptop’s enclosure to slouch as it starts losing power…?

No doubt, the whole idea starts to feel quite comical!

In fact, it seems as if dealing with highly complex tasks of information handling and processing requires a much simpler control: a lowest common denominator of input actions.
I wonder if people would want to perform awkward ‘hand/body gymnastics’ to navigate Facebook! It seems to contradict the idea of interactivity as a tool. A tool, like a hammer, is meant to have an easy and comfortable interface (the handle where you grip it) and to leverage our bio-mechanics with amplified results on the output side (the hammerhead).
So perhaps what is needed is simply a minimal toolbox of effortless interactions capable of manipulating a large landscape of information.

I don’t mean to disregard Bret’s ideas; rather, I think we need a paradigm shift in the way we process, display and control information to make organic interactivity possible, something that has to be fueled by developments across tech, manufacturing, engineering and commerce.

I know I too have raised many questions without providing any answers, but I am going to attempt to do just that, at ITP and beyond!