Document Type

Thesis

Date of Award

12-31-2022

Degree Name

Master of Science in Data Science - (M.S.)

Department

Data Science

First Advisor

James Geller

Second Advisor

Mark Cartwright

Third Advisor

Przemyslaw Musialski

Abstract

The effective design of environmental sounds is crucial to creating an immersive experience. Classical Procedural Audio (PA) models were developed to give the sound designer a fast way to synthesize a specific class of environmental sounds in a physically accurate and computationally efficient manner. These models are controllable because their parameters are chosen by analyzing a class of sounds. However, the resulting synthesis lacks the fidelity needed for an immersive experience, so the sound designer would rather search through an extensive database for real recordings of the target sound class. This thesis proposes the Procedural audio Variational autoEncoder (ProVE), a general framework for developing high-fidelity PA models through data-driven neural audio synthesis methods, to address the lack of realism in classical PA models. The two-step procedure for training ProVE models is explained through two example sound classes: footsteps and pouring water.

To show the framework's capacity for interactive use in sound design workflows, the thesis also demonstrates a web application in which users generate footstep sounds by setting the control variables of a pretrained ProVE model. The increase in fidelity from ProVE models is assessed through objective audio evaluations and subjective evaluations against classical PA methods. The results show that these learned neural PA models are feasible for sound design projects. The thesis concludes with a discussion of applications and future research directions.

Recommended Citation

Serrano, Danzel, "A neural analysis-synthesis approach to learning procedural audio models" (2022). Theses. 2097.
https://digitalcommons.njit.edu/theses/2097

Included in

Computer Sciences Commons
