Martha Janicki, Gabe Warshaw, Adnan Aga
This was the third and final project for my Introduction to Physical Computing class.
For our final project, we made a MIDI controller for Ableton. The music plays in p5.js with a corresponding visualizer. The user can apply four effects to the beat that is playing by moving their wrists in front of the camera; the movement is interpreted by the PoseNet machine-learning model.
The design was a synthesis of two ideas: Gabe's idea to create a physical MIDI controller for his studio, and Adnan's and my idea to create a game similar to the Plink! game, which we played in our Intro to Computational Media class.
We knew we wanted a physical controller, PoseNet integration, and a visualizer for the audio.
The visualizer has a video at the top. When an effect is selected, the sketch draws two blue spheres at the points of the wrists and runs a 10-second timer, during which the user moves their wrists to adjust the values of the effect on the beats.
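The 10-second effect window can be sketched as a pair of small helper functions. This is an illustration rather than the project's actual code; the function names are mine, and the timestamps are in milliseconds (as p5's `millis()` or `Date.now()` would supply):

```javascript
// Illustrative sketch of the 10-second effect window. Timestamps are
// in milliseconds; names (effectActive, secondsRemaining) are hypothetical.
const EFFECT_WINDOW_MS = 10 * 1000;

function effectActive(startedAt, now) {
  // true while the user can still move their wrists to shape the effect
  return now - startedAt < EFFECT_WINDOW_MS;
}

function secondsRemaining(startedAt, now) {
  // a countdown that could be shown next to the blue wrist spheres
  return Math.max(0, Math.ceil((EFFECT_WINDOW_MS - (now - startedAt)) / 1000));
}
```

In a p5 sketch, `draw()` would check `effectActive(startedAt, millis())` each frame and stop forwarding wrist readings once the window closes.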
The controller also has four drum beats and four melody loops for the user to play.
Design mock-up of the physical MIDI controller. The top part of the controller has a green Start button, red Stop button, and red Reset button; there are four white buttons for each effect. The bottom part of the controller has four buttons for melodies and four buttons for drums.
Here is the user flow with the tool:
Flow chart of user interaction with the MIDI controller for Ableton with PoseNet effects. When the user turns on the machine and presses Play, a beat and drum sequence play. A visualizer on a computer screen plays a corresponding graphic representation of the music. Pressing one of the effect buttons sends a signal for the X and/or Y coordinates of the wrists through the Arduino to Ableton, based on readings from the computer's webcam; Ableton adjusts the sound of the music based on the input data and, through the Arduino, sends it to the p5 visualizer. Pressing a melody button or drum button sends a signal through the Arduino into Ableton to switch the melody or drum track, which then sends the new track to play back through the Arduino into p5. The Stop button pauses the music; the Reset button resets any effects applied to the beats.
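Before a wrist coordinate can drive an effect in Ableton, it has to be scaled from webcam pixels into a parameter range. As a sketch of that step (not the project's actual code), assuming a 640-pixel capture width and a standard MIDI CC range of 0–127:

```javascript
// Hypothetical helper: scale a smoothed wrist coordinate (in webcam
// pixels) into a MIDI CC value (0-127) before it travels through the
// Arduino to Ableton. CAM_WIDTH assumes a 640-pixel capture frame.
const CAM_WIDTH = 640;

function wristToCC(wristX) {
  // clamp to the frame, then scale linearly into the MIDI CC range
  const clamped = Math.min(Math.max(wristX, 0), CAM_WIDTH);
  return Math.round((clamped / CAM_WIDTH) * 127);
}
```

The clamp matters because PoseNet will occasionally report a keypoint slightly outside the frame, which would otherwise produce an out-of-range CC value.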
My contribution to the project was to build out and map the wrist points in PoseNet and to send the resulting variables to the Arduino through p5.js. I collaborated with my team on determining what motions the user should make to control the four effects, on the design of the final box, and on the appearance of the visualizer, for which I provided sketches of its final look.
The basic setup of PoseNet to identify wrists was straightforward. Once the webcam saw enough of the body to locate the wrists, PoseNet was able to plot their points.
However, the points were jumpy, and we knew that piping them into Ableton would produce shakiness in the final effect. As a result, I built a smoothing function for each of the four readings we wanted to map (left-wrist X, left-wrist Y, right-wrist X, and right-wrist Y): each new reading that passed a 0.8 confidence threshold is pushed into an array, and the last 20 values are averaged. This adds a small amount of lag to the readings but produces a much cleaner visualization.
Here is an example for left wrist X-value:
let numReadings = 20;

// LeftWristX variables for smoothing
let readingsLWX = []; // make one of these for each x & y point of each wrist
let readIndexLWX = 0; // the index of the current reading
let totalLWX = 0; // the running total
let averageLWX = 0; // the average that I will want to send to circle readings

// fill the readings array with zeros before the first pose arrives
for (let thisReading = 0; thisReading < numReadings; thisReading++) {
  readingsLWX[thisReading] = 0;
}

function gotPoses(poses) {
  if (poses.length >= 1) {
    // keypoint 9 is the left wrist in PoseNet
    if (poses[0].pose.keypoints[9].position.x !== undefined &&
        poses[0].pose.leftWrist.confidence > 0.8) {
      // SMOOTHING FUNCTION -- LeftWristX
      // subtract the oldest reading from the running total:
      totalLWX = totalLWX - readingsLWX[readIndexLWX];
      // overwrite it with the new reading from PoseNet:
      readingsLWX[readIndexLWX] = poses[0].pose.keypoints[9].position.x;
      // add the new reading to the total:
      totalLWX = totalLWX + readingsLWX[readIndexLWX];
      // go to the next position of the array...
      readIndexLWX = readIndexLWX + 1;
      // ...and wrap around to the beginning at the end:
      if (readIndexLWX >= numReadings) {
        readIndexLWX = 0;
      }
      // calculate the average
      averageLWX = totalLWX / numReadings;
    }
  }
}
I repeated this process for the other three values (left wrist Y-value, right wrist X-value, and right wrist Y-value).
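Since the four copies differ only in which variable they track, the same logic could be packaged once and instantiated four times. This `Smoother` class is my own illustration rather than the project code, but it implements the identical ring-buffer moving average:

```javascript
// A compact alternative to duplicating the smoothing code four times:
// one ring-buffer moving-average class, instantiated per coordinate.
// (Illustrative refactor; not the code we actually shipped.)
class Smoother {
  constructor(numReadings = 20) {
    this.readings = new Array(numReadings).fill(0);
    this.index = 0;   // position of the oldest reading
    this.total = 0;   // running total of the buffer
    this.average = 0; // the smoothed value to send onward
  }

  add(value) {
    // replace the oldest reading and update the running total
    this.total -= this.readings[this.index];
    this.readings[this.index] = value;
    this.total += value;
    this.index = (this.index + 1) % this.readings.length;
    this.average = this.total / this.readings.length;
    return this.average;
  }
}

// one smoother per tracked value
const leftWristX = new Smoother();
const leftWristY = new Smoother();
const rightWristX = new Smoother();
const rightWristY = new Smoother();
```

Inside `gotPoses`, each confidence-checked keypoint reading would then become a single call like `leftWristX.add(poses[0].pose.keypoints[9].position.x)`.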