Ashvala Vinay

I am a PhD student who spends time straddling the intersection of Music, Machine Learning and Audio Synthesis. Previously, I studied Electronic Production and Design at Berklee. I have worked at multiple startups in the Bay Area at their earliest stages and engineered their products.

Over the course of my journey, I have mastered a variety of tools and technologies that allow me to be an effective ML researcher and programmer. I enjoy working with teams and building new technologies with them.

Education


PhD, Music Technology (with minor in Machine Learning), Georgia Institute of Technology (Mar 2024*)

Advised by Prof. Alexander Lerch. Dissertation work focuses on building evaluation metrics for neural audio synthesizers.
Relevant Coursework: Statistical Machine Learning, Human Computer Interaction

Master's, Music Technology, Georgia Institute of Technology (May 2020)

Relevant Coursework: Deep Learning for Music, Mathematical Foundations of Machine Learning, Digital Signal Processing, Music Information Retrieval, Interactive Music, Music Perception and Cognition

Bachelor's, Electronic Production and Design, Berklee College of Music (May 2016)

TDK Music Technology award for outstanding achievement (Fall 2015)
Dean’s List: Summer 2013, Summer 2015, Spring 2016

Work experience


Graduate Research Assistant, EarSketch, Georgia Institute of Technology (2022 - 2023)

• EarSketch is an educational platform that helps children learn programming through music. I worked on the sound recommendation engine and the UI.
• Worked on the sound recommendation engine: augmented the existing engine with information about musical keys, rhythm and beats to make more musically driven suggestions.
• Worked on the redesign to the sound browser, which was implemented in the summer of 2022.

Graduate Research Assistant, Brain Music Lab, Georgia Institute of Technology (2019 - 2022)

• Worked with Prof. Grace Leslie on designing and developing frameworks for analysis of multi-modal data with a specific focus on audio and physiological data from EEG and ECoG in addition to data from sensors such as the Leap Motion.

Research Intern, Corohealth (2021)

• Corohealth is a startup focused on healthcare through music therapy.
• During my time there, I designed and built algorithms that allowed them to generate relevant playlists for a variety of purposes using their internal data.
• I also built user interfaces that allowed them to explore their datasets.

Developer, Keyo AI (2017)

• Developed a property search engine for addresses in the United States.
• Managed a database of 80 million records with data from the United States Census Bureau and public city data.
• The stack was developed with JavaScript, Postgres, Jade and PostGIS.
• Company acquired by Zillow in 2020

Developer, Technical Co-Founder, LiveAds Inc (2016 - 2017)

• Worked on Prompt, an intelligent teleprompter and streaming tool that was contextually aware of interests to aid in livestreaming.
• Pivoted Prompt into Streamstager. Streamstager is a platform to create interactive overlays intended for use in livestreams on social media platforms.
• Used by VISA in the Super Bowl streams they ran from Houston, Texas in 2017.

Developer, Boulanger Labs (2014 - 2016)

• Muse is a Leap Motion-exclusive ambient music generation app built on top of OpenFrameworks and Csound.
• Designed in collaboration with Grammy-nominated artist BT.
• Inherited an existing codebase, then fixed bugs and optimized it.
• In addition to fixes, helped create a Windows version using the macOS codebase.

Teaching Experience


Instructor, Audio Content Analysis, Georgia Institute of Technology (Fall 2023)

Instructed a course on audio content analysis, covering signal processing, feature extraction, machine learning techniques, evaluation, and MIR methodology.

Teaching Assistant, Music Perception and Cognition, Georgia Institute of Technology (Spring 2021)

Assisted in a course examining how humans process musical sound, covering auditory systems, psychoacoustics, music cognition, and psychology. Responsibilities included teaching lectures and grading.

Teaching Assistant, Music Technology History and Repertoire, Georgia Institute of Technology (Fall 2020)

Assisted in a graduate course on the history, aesthetics, and technology of computer and electronic music. Responsibilities included teaching lectures and grading.

Teaching Assistant, Csound and Audio Programming Classes, Berklee College of Music (2015)

Supported classes on sound design, composition, and audio programming, working extensively with Dr. Richard Boulanger. Managed class examples and provided office hours support.

Publications (Also at: https://ashva.la/publications)


Vinay, Ashvala & Lerch, Alexander. (2023). AQUATK: An Audio Quality Assessment Toolkit. Late Breaking Demos. 24th International Society for Music Information Retrieval, Milan, Italy.

Smith, Jason B., Vinay, Ashvala, & Freeman, Jason. (2023). The Impact of Salient Musical Features in a Hybrid Recommendation System for a Sound Library. Joint Proceedings of the ACM IUI Workshops. 3rd Workshop on Intelligent Music Interfaces for Listening and Creation.

Vinay, Ashvala, and Lerch, Alexander. (2022). Evaluating Generative Audio Systems and their Metrics. Proceedings of the International Society for Music Information Retrieval Conference (ISMIR). ISMIR, Bangalore, IN.

Rahimi, Syedahmad, Smith, Jason B., Truesdell, Erin. J. K., Vinay, Ashvala, Boyer, Kristy E., Magerko, Brian, Freeman, Jason, & Mcklin, Tom (n.d.). Validity and Fairness of an Automated Assessment of Creativity in Computational Music Remixing. Workshop on Automated Assessment and Guidance of Project Work. 24th International Conference on Artificial Intelligence in Education, 2023.

Vinay, Ashvala, Alexander Lerch, and Grace Leslie. (2021). "Mind the beat: detecting audio onsets from EEG recordings of music listening." ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2021.

Academic Service


ISMIR 2024

I am currently a member of the ISMIR 2024 organizing group. I will be reprising my role as web chair and taking the lead on infrastructure needs, website development and content management.

MILC 2023

I was one of the organizers of the 3rd Intelligent Music Interfaces for Listening and Creation workshop held at IUI 2023. I helped with the website design and content (https://milc2023.github.io). I also helped with the peer review process.

ISMIR 2022

As a member of the ISMIR 2022 organizing group, I helped with organizing the infrastructure for hosting and deployment used for the conference (http://ismir2022.ismir.net).

ISMIR 2021

As a member of the ISMIR 2021 organizing group, I worked on developing and deploying the webpage for the conference (http://ismir2021.ismir.net). I implemented code in Jekyll, JavaScript and Sass for this project.

Projects


AquaTK

• AquaTK = Audio Quality Assessment Toolkit.
• It’s a Python-based library designed to help with evaluating the audio quality of generative audio models.
• Recently demoed at ISMIR 2023 as part of the Late Breaking Demos (LBD) session.
• Implements several popular metrics including FAD, Kernel Distances and PEAQ
• 70+ stars on GitHub
Code available on GitHub

PaperSynth

• PaperSynth is a free, open source iOS app designed to take photos and convert them into interactive synthesizers.
• The original implementation’s OCR algorithm for character and word recognition was created with Keras.
• The repository is one of the top 25 repositories on GitHub for AudioKit.
• The audio engine was written with AudioKit.
Code available on GitHub

Audio Style Transfer

• Class Project for Special Problems (MUSI-8903).
• We designed, developed and trained a deep neural network to learn representations and classify raw audio data from NSynth data.
• This was then applied to produce style-transfer on timbre.
Code available on GitHub

Jamiable

• Class Project for Human Computer Interaction.
• Worked on the design and interface building for this project.
• Designed wireframes and created prototype user interfaces.
• Won second place at the Convergence in Innovation competition at Georgia Tech.

Awards


TDK Music Technology Award for outstanding achievement, 2015

2nd place, Convergence in Innovation Competition 2021, Georgia Tech

Skills


Specializations

Machine Learning, Deep Learning, Audio Evaluation, Music Information Retrieval (MIR), Human-Computer Interaction (HCI) and User Interface design.

Programming

Languages: Python, Swift, JavaScript, TypeScript, MATLAB, Haskell, Max/MSP, Csound
Frameworks: PyTorch, NumPy, React, SvelteJS, AudioKit, CoreAudio, CoreML, TidalCycles

Audio Production tools

Logic Pro and Ableton Live