SynthScribe: Deep Multimodal Tools for Synthesizer Sound Retrieval and Exploration
CoRR (2023)
Abstract
Synthesizers are powerful tools that allow musicians to create dynamic and
original sounds. Existing commercial interfaces for synthesizers typically
require musicians to interact with complex low-level parameters or to manage
large libraries of premade sounds. To address these challenges, we implement
SynthScribe, a full-stack system that uses multimodal deep learning to let
users express their intentions at a much higher level. The system addresses
three difficulties: 1) searching through existing sounds, 2) creating
completely new sounds, and 3) making meaningful modifications to a given
sound. This is achieved with three main features: a multimodal search engine
for a large library of synthesizer sounds; a user-centered genetic algorithm
by which completely new sounds can be created and selected given the user's
preferences; and a sound editing support feature which highlights and gives
examples for key control parameters with respect to a text- or audio-based
query. The results of our user studies show that SynthScribe is capable of
reliably retrieving and modifying sounds while also affording the ability to
create completely new sounds that expand a musician's creative horizon.