Computer Vision in Art: Seeing the Invisible

Levin outlines the different elements of computer vision that artists and designers must be aware of in order to implement this technology in their projects. He also provides a ‘short history’ of the early stages of computer vision in interactive art pieces and identifies the major themes that artists have addressed through their work.

I was particularly intrigued by The Suicide Box (Bureau of Inverse Technology 1996) and Cheese (2003). Natalie Jeremijenko of the Bureau of Inverse Technology reacted to some of the criticism of the project by pointing out that it stemmed from “the inherent suspicion of artists working with material evidence.” Her words are extremely thought-provoking in a context of growing digitisation inasmuch as they force the question: who gets to mobilise digital or digitised data as legitimate evidence? How we answer this question will have consequences for how open and democratic the digital realm ends up being. If we endow everybody with the ability to use and mobilise digital data, then digital platforms can prove themselves to be truly disruptive. If we limit this ability, then we will just be reproducing old structures for producing knowledge.

Cheese successfully objectifies the pressures that different forms of surveillance exert on (female) bodies. In doing so, it highlights one of the most productive areas of computer vision for artists. Computer vision technologies—as well as a host of other data-gathering technologies in the devices we use—are often concealed. By creating environments in which participants can see and react to how they are being perceived and processed as data, and the consequences this has for them and for the information being produced, interactive art relying on computer vision can help people become more aware of the technosocial system we are all embedded in.

Understanding the Second Machine Age Beyond Prediction

In The Digitisation of Just About Everything, the authors explain how the rise in digitisation is changing the nature of techno-social systems. They recount the economic properties of information identified by Varian and Shapiro—zero marginal cost of reproduction and the fact that information is non-rival—and add that, in what they regard as a ‘second machine age,’ some information is no longer even costly to produce. All of this is augmented by increasingly better, cheaper and more sophisticated technologies.

At the core of the authors’ appraisal of the benefits of digitisation lies the notion that digitisation will help us to better understand and predict different behaviours. There is a strong element of truth in this. Statistically speaking, our models do get better as we have more data—and this is primarily what digitisation has given us: more data. However, I do not think the process will be as straightforward as the authors depict it. More data does not necessarily mean better data, and digitisation is only partially equipped to provide us with the latter. Digitisation can allow us to record new kinds of information about more people, but that information is limited to the digital realm. As much as technologists want to believe it, there is no one-to-one correspondence between the digital and physical worlds.

Digitisation is an enhancement of old statistical techniques, not a panacea. Our challenge is to understand the consequences of the constant, increasingly complex interactions between the digital and physical realms. This requires more creativity and audacity than mere statistical prediction, because the encounter between these two worlds is yielding a context that is different from the sum of its parts. Once we understand this, we can begin to comprehend new behaviours instead of just extrapolating old ones.

Precision and Interpellation in ‘The Language of New Media’

In The Language of New Media, Lev Manovich defines, and traces the evolution of, ‘new media’ as a concept and as a form of cultural production. Through a brief recollection of the parallel development of computers and of physical media, Manovich identifies the historical context in which “media becomes new media.” In the late nineteenth and early twentieth centuries, he argues, cultural forms undergo computerization. After this, Manovich moves on to identify some principles of new media: numerical representation, modularity, automation, variability, and transcoding.

There are three things I want to highlight about Manovich’s piece. First, his exercise provides us with a precise vocabulary with which to distinguish ‘new media’ from ‘media.’ Whether we agree or disagree is another issue, but his effort points to the importance of moving toward precision in language rather than relying on platitudes to describe our changing media landscape. The second aspect I want to highlight is a consequence of the first. Manovich’s concept of ‘new media,’ and the principles he identifies, are conceptually and theoretically productive because they point at important questions about the nature of the form. In the piece, Manovich himself tries to settle some of these debates, for example when he debunks the fallacy that all ‘new media’ is digitized analog media, or that ‘new media’ is more interactive than ‘old media.’

Finally, I think Manovich’s piece eloquently reveals an area where processes of isomorphism are understudied in spite of the large consequences they can have for our everyday lives. When discussing the principle of transcoding, he points at the cultural and cognitive feedback loop between computers and cultural production: we make computers as much as computers make us. This idea is of crucial importance as we think about what we lose when we rely on computers to create and organise different systems of meaning.

Jam Box – Live DJ Music Making Tool (Full Documentation)

Jam Box is a self-contained music-making device that sends MIDI signals to Ableton Live, a professional music production software. The device contains an array of music samples from three different genres mapped to individual buttons, as well as knobs to control volume, tempo and other parameters. The idea is to allow users to create their own music and experience what it is like to be a DJ.

Context

While taking a DJ class this semester, I became very passionate about mixing different audio samples and layering sounds to create a whole new piece. I came across MIDI controllers, which are devices (keyboards, pads, etc.) that trigger notes on a digital instrument via Ableton Live and other music-making software.

 

MIDI Keyboard Controller
MIDI Controller Pad

Although the devices above require prior knowledge of music software and a basic understanding of DJ jargon (BPM, MIDI mapping, filters, etc.), I wanted to create a tool that allows people outside that world to experience the joy of composing music without worrying too much about the technicalities. Experiencing that joy myself by DJ-ing live during a school gig was a unique opportunity that I really wanted to share with everyone. This device is therefore a way to invite people to walk in the shoes of a DJ and have a great time.

Concept 

The Jam Box holds music samples from three different genres — Arabic music, electronic music and hip-hop. Each genre is mapped to 11 selected music samples arranged in Ableton Live. Whenever a genre is selected (i.e. the corresponding button on the Jam Box is pushed), Ableton Live activates that genre’s “group track” (see video below) in Solo mode. This means that all 11 buttons (there should be 12, but one is not working) on the audio-sample keypad trigger music samples from that particular genre. Hence, when the user presses any button on the audio-sample keypad with the Arabic genre selected, for example, the corresponding sample within the “Arabic” group is triggered.

The user can also select multiple genres at the same time, which triggers samples from each selected genre to create a mix of tracks. The Jam Box also has knobs that control master volume, tempo, filter (which removes or adds bass) and pan. The user can modify these parameters at any time during their session.

The Jam Box is also designed so that whenever a button is pressed for the first time, the LED underneath it turns on to signal activity; when it is pressed a second time, the LED turns off. This makes the Jam Box more intuitive, letting users see which buttons (samples or genres) are active and which are not.

Overview

A 2-minute summary of the project. Excuse my video editing skills 🙂

Materials

Parts 

  • 1 4×4 Adafruit Trellis Monochrome Driver
  • 1 Silicone Elastomer 4×4 button keypad
  • 1 Arduino Redboard
  • 4 10k Potentiometers
  • 4 potentiometer covers (knobs)
  • 16 LEDs (size: 3mm)
  • 10 screws
  • 16 male jumper wires
  • Peel-and-stick paper (for labels)

Tools

  • 3D printer
  • Soldering wire and iron
  • Screwdriver
  • Flush Diagonal Cutter

Software

  • Arduino.cc
  • Processing 3
  • Ableton Live 9

Building Process

Step 1

3D print each part of the Jam Box’s enclosure. The STL files can be found at http://www.thingiverse.com/thing:409733.

Step 2

  • Solder the 3 mm LEDs onto the Trellis PCB. The longer leg of each LED goes into the positive ‘+’ hole of the Trellis. Cut the excess legs using a flush diagonal cutter.

 

  • Solder 4 wires to the Trellis PCB’s SDA, SCL, GND and 5V pads; these will connect to the SDA, SCL, GND and 5V pins on the Arduino RedBoard.

Test the LEDs to make sure each one works before proceeding with the rest of the build. The Arduino code can be found in the “Code” section below.
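As a quick check, a minimal test sketch along the lines of Adafruit’s Trellis examples can sweep every LED on and off once. The default I2C address 0x70 is an assumption; adjust it if your Trellis is addressed differently.

```cpp
#include <Wire.h>
#include "Adafruit_Trellis.h"

Adafruit_Trellis matrix0 = Adafruit_Trellis();
Adafruit_TrellisSet trellis = Adafruit_TrellisSet(&matrix0);

void setup() {
  trellis.begin(0x70);            // default I2C address; change if yours differs

  // Sweep every LED on, then off, to confirm each solder joint.
  for (uint8_t i = 0; i < 16; i++) {
    trellis.setLED(i);
    trellis.writeDisplay();
    delay(50);
  }
  for (uint8_t i = 0; i < 16; i++) {
    trellis.clrLED(i);
    trellis.writeDisplay();
    delay(50);
  }
}

void loop() {
}
```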

Step 3

Wire up the potentiometers and install them in the Jam Box’s enclosure cover. The potentiometers connect to pins A0–A3 on the Arduino RedBoard and share the RedBoard’s 5V pin with the Trellis PCB (solder both to the same pin).

 

Step 4

Assemble all parts of the enclosure as well as the Trellis PCB, the keypad and potentiometers. Use screws to tighten everything together.

 

Code

Arduino

I use Arduino to set up serial communication between the physical interactions and the Processing sketch. Whenever a button is pressed, the Arduino prints the number of the pressed button over serial and turns the corresponding LED on or off; that information is then read by Processing. Likewise, the Arduino maps the potentiometers’ values to the 0–127 range (following Ableton Live’s MIDI mapping conventions) and sends those values to the Processing sketch.

Some of the libraries used are the Adafruit Trellis Library and the Wire library.
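The full code is linked below; as a rough sketch of the loop described above (the “P<knob>:<value>” message format for the knobs is a placeholder of mine, not necessarily what the linked code uses), the Arduino side might look like this:

```cpp
#include <Wire.h>
#include "Adafruit_Trellis.h"

Adafruit_Trellis matrix0 = Adafruit_Trellis();
Adafruit_TrellisSet trellis = Adafruit_TrellisSet(&matrix0);

int lastPot[4] = {-1, -1, -1, -1};    // last value sent for each knob

void setup() {
  Serial.begin(9600);
  trellis.begin(0x70);                // default I2C address
}

void loop() {
  delay(30);                          // the Trellis needs ~30 ms between scans

  if (trellis.readSwitches()) {
    for (uint8_t i = 0; i < 16; i++) {
      if (trellis.justPressed(i)) {
        // Toggle the LED under the key and report the key number to Processing.
        if (trellis.isLED(i)) trellis.clrLED(i);
        else                  trellis.setLED(i);
        Serial.println(i);
      }
    }
    trellis.writeDisplay();
  }

  // Map each potentiometer from its 0-1023 reading to the 0-127 MIDI range,
  // sending only when a value changes ("P<knob>:<value>" is a made-up format).
  for (uint8_t p = 0; p < 4; p++) {
    int value = map(analogRead(A0 + p), 0, 1023, 0, 127);
    if (value != lastPot[p]) {
      lastPot[p] = value;
      Serial.print("P");
      Serial.print(p);
      Serial.print(":");
      Serial.println(value);
    }
  }
}
```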

Full Arduino Code can be found here: Jam Box Arduino Code

Processing

Processing receives the serial data from the Arduino and parses it to trigger a set of actions. For instance, when button ‘0’ is pressed, the Arduino sends a ‘0’ to Processing, and through a series of ‘if’ statements Processing sends a MIDI signal to Ableton Live to trigger the music sample corresponding to button 0.

I used the Serial and The MidiBus libraries. The latter converts those actions into MIDI signals sent on a specific channel to Ableton Live.
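A stripped-down sketch of this bridge, under the same assumptions as the Arduino example above (the “P<knob>:<value>” format, the serial port index, and the base note 36 are placeholders rather than the values used in the linked code), could look like this:

```java
import processing.serial.*;
import themidibus.*;

Serial arduino;
MidiBus midi;

void setup() {
  // Open the serial port the RedBoard is on (index 0 is an assumption).
  arduino = new Serial(this, Serial.list()[0], 9600);
  arduino.bufferUntil('\n');

  // "Bus 1" is the IAC Driver bus that Ableton listens on (see the next section).
  midi = new MidiBus(this, -1, "Bus 1");
}

void draw() {
  // Nothing to draw; the sketch only relays serial events as MIDI.
}

void serialEvent(Serial port) {
  String line = trim(port.readStringUntil('\n'));
  if (line == null || line.length() == 0) return;

  if (line.startsWith("P")) {
    // Knob message, e.g. "P2:64" -> control change number 2, value 64.
    String[] parts = split(line.substring(1), ':');
    midi.sendControllerChange(0, int(parts[0]), int(parts[1]));
  } else {
    // Button message: the key number becomes a note Ableton has been MIDI-mapped to.
    int button = int(line);
    midi.sendNoteOn(0, 36 + button, 127);   // base note 36 is a placeholder
  }
}
```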

Full Processing Sketch can be found here: Jam Box Processing Code

Ableton Live

Ableton Live is music-making software designed to receive MIDI signals. To receive MIDI from Processing, we need to set Ableton’s MIDI input to “IAC Driver (Bus 1)”, the virtual MIDI bus exposed by the Mac’s Audio MIDI Setup, and turn on the Remote option in Ableton’s MIDI preferences.

Following that, we use Ableton’s MIDI mapping mode to assign each audio sample to a particular button on the Jam Box. By selecting an audio sample and pressing any button on the keypad, a note is automatically assigned to that sample, creating the link between the box and Ableton. Likewise, the potentiometers can be mapped to the volume of Ableton’s master track, the tempo (BPM), the master pan and a filter using the same method.

NB: Audio samples have to be manually selected and placed in different scenes in Ableton. For simplicity, it would be best to group the audio samples by genres to avoid confusion.

Challenges and Improvements

One of the biggest challenges in realizing this project was on the software side. Creating the right communication between three different programs — Arduino, Processing and Ableton Live — turned out to be quite complicated. The process required creativity at different levels: first in setting up the Arduino to print information for all buttons and potentiometers, and second in programming Processing to decipher the grouped information sent by the Arduino and convert it into individual messages to be communicated to Ableton. The most difficult part was navigating Ableton Live.

As this was my first time coming across MIDI mapping and using Ableton Live from this perspective, I took this part as a personal challenge to really develop my skills. After getting a good grasp of MIDI mapping fundamentals thanks to Omar Shoukri, the next challenge was to confront the concept with the software’s constraints. Questions I had to ask myself were: “What would make sense for users to do?” “How can I improve their experience by making a comprehensible Jam Box with intuitive MIDI mapping?” “How do what I know now and the limitations of Ableton affect my initial concept?” These questions were very useful in helping me put things in perspective and code my device accordingly. For instance, I had to map the “genre” buttons to trigger Solo mode in Ableton to avoid confusion and unintentional mixes of samples: because each button is mapped to three different samples from three different genres, pressing a button would otherwise trigger all three samples at once, following Ableton’s logic. Such solutions only became apparent while navigating Ableton Live.

Overall, realizing this project was an amazing learning opportunity that still blows my mind to this very moment. I enjoyed putting together the pieces of the device and using tools that I had never encountered before, such as a Trellis PCB and 3 mm LEDs. Seeing my concept evolve was also very enriching in terms of evaluating how far I could push myself to incorporate features I had not planned on adding and discovering the power of Ableton Live. It was particularly refreshing to see people impressed and excited about my project during the Interactive Media Showcase. A couple of people even asked me if I was selling my product! *mindblown* Kid Koala himself really enjoyed playing with the device and posted a picture of it on his social media account! He also said that there is a huge demand for portable MIDI controllers among DJs around the world and that this would sell pretty quickly!

DJ Kid Koala posted my project on his Instagram!!!

In terms of improvements, adding a record button on the pad to allow people to take their own mixes home would be a great way to create a lasting memory. Adding more flexibility to the range of available sounds would also let people pick the genres or sounds they like most and go from there. In terms of physical components, using an Arduino Leonardo, or another board that supports native USB MIDI communication, would cut the software chain from three programs down to two: in theory, the device could then work with only Arduino and Ableton Live. Future projects could explore that path.

Life-Size Totoro: User Testing

Initially, I was going to write one large documentation post that included my user testing notes along with the whole process for my final project, since I had not finished my Totoro early enough to do proper user testing. However, out of respect for the rest of the class, who wrote separate blog posts, I shall do the same.

Throughout this whole process, I had two sessions of user testing. The first one took place when the code was ready and the logic functioned, but the project was not yet mounted in the installation. Instead, I projected it onto a wall in the IM lab and placed the projector to the side so as not to interfere with people’s shadows. The purpose of this initial user testing was to identify whether people actually knew what to do with the objects or whether more signifiers were needed. A general trend I identified from observing the four people who tested my project was confusion about how the umbrella worked. Without any signifiers, they would rotate the umbrella, changing the LEDs’ position and breaking the code. Also, since there were only two LEDs on each side of the umbrella, a slight rotation would cause it to malfunction, which could be fixed by adding more LEDs on both sides. Furthermore, upon seeing the two girls in the image, users assumed they were interactive as well and expected them to do something when hovered over. In other cases, such as when the “Glove Mode” was on, most users did not know where their hand was and thought the camera was in front of them rather than behind, which made them interfere with the IR camera and made the code glitchy as well. Based on these observations I made the following list of improvements:

  • Add more LEDs to the umbrella
  • Fix Totoro’s eyes –> tone down the opacity of the cheeks, since people mistake them for his eyes
  • Fix the visuals for the umbrella (the rain gets glitchy and does not update accurately)
  • Change the X values for the animation frames –> make Totoro move the closer one gets to his belly
  • Remove the girls from the background
  • Add a signifier of where people are rather than only showing the IR lights; perhaps Totoro’s eye movement could serve this purpose

After fixing these issues and making the code more reliable, I tested it again in the actual exhibition space. Most of the issues stated above had been addressed and were no longer troublesome. After this second session, I decided to add a hand-silhouette signifier that shows users where their hand is in relation to Totoro, which also makes the interaction more immersive.

Here is a sample video of my user testing:

Jam Box – User Testing

I invited two people to test my device. I did not have any labels on the box and left it to the interpretation of the user. The first candidate found it pretty straightforward and played with each knob and button to discover the individual beats and effects mapped onto each. He was looking at Ableton Live at the same time and was able to detect the changes in genres. This was perhaps due to his background in music.

The second candidate was pretty confused and did not know what each button did. She suggested I label the “genre” buttons and each of the knobs for more clarity. Another frustration was that one of the buttons was not working, which led her to press it multiple times before I told her it was broken.

Since the second candidate better reflects the typical user I would encounter, I decided to label the buttons and knobs to improve the user experience. Another observation from both testers was the absence of a stop button to cut off all the music in case they want to start from scratch. They thought it would be a great way to begin another mix if the user decides to.

Overall, the feedback was really helpful in matching the needs of both musically experienced and inexperienced users and in seeing how people interpret the box’s use in general. However, I found that some explanation is necessary either way before the user starts, to give context and detail about the device’s functionality.

 

User Testing Notes: Atmanna (Wish)

– Tester 1:
  1. Make the animation smoother
  2. Tell a story with it
  3. What do you think of ‘make a wish’
    1.  get a random wish?
  4. It’s pretty clear what needs to be done but it might get confusing if you have a physical object
– Tester 2:
  1. Smoother animation
  2. Talk about why people wish on dandelions
  3. Background music
  4. Think about how to place microphone

Overall I learned that people really enjoyed the visual effect and feeling of satisfaction that you get from blowing a dandelion. I received feedback on how I should make a physical dandelion and what kind of story I’m telling with it.

Final Project: User Testing

Even though my project is still missing its hardware component as well as refined visuals, having my friend (who didn’t know what my project was about) come test it out gave rise to some incredibly useful feedback. Here are some of the things I learned and plan to work on:

  1. My project is about Carnatic music, but not many people at this school are aware of what that is, and even if they are, they would probably not be able to tell that the project is in fact based on the raga system. My user felt that without context, the project was confusing and vague.
  2. To this end, she suggested adding an informational page at the beginning of the project, helping people situate the project in some sort of context.
  3. My user also suggested that I label the parts of the musical instrument, but on learning that I was planning to create a keyboard-like structure for the instrument, she thought it might need less labelling than she originally expected.
  4. She also thought that my visuals were random and confusing, and suggested that the visuals relate more closely to the user’s physical input. In line with that, I have changed my visuals to reflect the exact position of the key that the user is pressing at any given moment.

I hope to conduct another round of user testing once my hardware part is done, which would then hopefully result in a more refined and user-friendly project.

 

Assignment 13: Zen – User Testing

I asked two people who were not familiar with my final project to test it and to offer feedback on what worked and what did not.

When it comes to my first tester, I received the following feedback:

  • The first thing he tried to do (not in the video above) was to run his hand through the flowers, expecting them to react (e.g. bending). Nothing happened, which left him disappointed.
  • He wanted more interaction with the hand flowers – he tried to plant them in the ground, but nothing happened.
  • He also suggested removing the stems of the hand flowers, to make them look more natural.
  • He wanted more body flower colors, to engage the users (“Let’s guess which color will grow next!”) while they are in the calm state.
  • He said that he was waiting for something to happen; and that the waiting was good with a chair, but frustrating when standing.
  • He also remarked that the Kinect silhouette looks ugly and should be smoothed.
  • Additionally, at first, he stood too close to the screen – which meant that he missed out on the wandering-through-the-field interaction.

 

My second tester had additional feedback:

  • Her first attempted interaction was to run her hand through the background flowers.
  • She also tried to plant the hand flowers into the ground.
  • She attempted to pick up one of the body flowers with her hand, and move it elsewhere.
  • Like the first tester, she did not notice the stomping interaction. She also stood too close to the TV.
  • Unlike the first tester, she said that the color scheme is nice, and that no new colors need to be added to the body flowers.
  • She suggested keeping score – e.g. how many body flowers did one manage to grow, how long one was calm, how many flowers were planted.

 

To address the concerns and suggestions, I plan to do the following for Wednesday:

  1. Mark an interaction area on the ground to signal to users how far away from the TV they should stand.
  2. Implement the planting interaction – this is relatively straightforward, and was one of the top expected interactions.
  3. Allow body flowers to be picked up and planted – similar to the first point, it is simple to implement and would add a playful element to the project, without detracting from the narrative of growth-from-calmness.
  4. If I have time, implement the interaction between the hands and the background flowers – this is the number one expected interaction, but it is more difficult to implement. Also, it encourages people to move around more, which might be distracting from the purpose of the project.

I decided not to alter the color scheme and shape of the flowers since one of my testers liked it; I also decided against keeping score for people’s calmness, because it contradicts the calm-down spirit of the project. Smoothing out the silhouette was deemed too difficult and not too necessary for the quality of the visualization (especially since people’s feet will be hidden by the ground flowers).

Birdy: User Testing

Even though I ended up user testing people who had at least a slight idea of my project/game (e.g. there is a bird, you fly around, etc.), I still learned a great deal from having them try it out. While the game is very simple and cutesy, and even provides some instructions, it was still far less obvious and clear than I thought it was.

While watching the users try the game was interesting and helpful (and sort of funny), the most useful part was hearing their specific feedback after finishing the game. Feedback included the following:

  • the bird moved too slowly
  • the instructions at the top were not that helpful, because it was difficult to look there/read them while the game was still going on
  • it was not clear when the game ended (because even though the user is taken to the main page after they reach 100 points, there is no “you won” message, etc.)
  • the blinking arrows on the sides add too much visual stimulation/confusion, and they interfere with noticing the other blinking elements that signify games
  • there should be something that signifies how many pages/environments there are, so the user knows when they have been to them all
  • the wing flapping directions are (possibly) counterintuitive, in that when you flap the left wing, the bird goes to the right, and vice versa

To fix these issues, I plan on doing the following before Wednesday:

  • making the bird move faster
  • pausing/freezing the game for several seconds whenever there are instructions, giving the user time to read them
  • making the arrows be still rather than blinking
  • naming the environments, and including those names at the top of the screen always, so the user knows which environment they are in, and how many there are total
  • including a message at the beginning telling the user to get to 100 points to win
  • editing the last message to say something more clear, like actually saying the game is finished, etc.
  • reconsidering the flapping directions

A few videos of user testing:

I also plan on having several users who are *completely* unfamiliar with the game test it out between now and Wednesday.

Overall, user testing was a great idea, and it has helped me figure out what changes I need to make in order to make the game intuitive and enjoyable.