# -*- coding: utf-8 -*-
"""Stem_Separation_Spleeter.ipynb
Automatically generated by Colab.
Original file is located at
https://colab.research.google.com/drive/1ZanGdGmndmOSa0q2arjO1hT5p72n56rm
# Audio Stem Separation with Spleeter
Audio stem separation is the task of isolating individual components (stems) from a mixed audio signal, such as vocals, bass, drums, and other instrumental parts. This technology has various applications, including music production, remixing, karaoke, and forensic audio analysis. Spleeter, developed by Deezer Research, is a powerful open-source tool that leverages deep learning to perform this separation efficiently and effectively.
This notebook provides a step-by-step guide to:
1. Install the Spleeter library
2. Separate an audio file into different stems (vocals, bass, drums, and other)
3. Visualize the separated audio sources using waveforms and spectrograms
Additionally, the notebook demonstrates how to use Spleeter from the command line.
The code is designed to be adaptable for both Google Colab and local Python environments, with clear instructions for running the script in different settings.
View Spleeter on GitHub: https://github.com/deezer/spleeter
## Instructions to run the code
Let's begin by installing Spleeter.
I had problems running this in my local Python 3.11 environment because my numpy and librosa versions were incompatible. If you run into the same problem, I would recommend using Colab instead or switching to an older Python version (for example 3.8).
"""
!pip install spleeter
!pip install librosa
!pip install "matplotlib>=3.5"
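"""Optionally, you can verify which versions pip actually installed. This is only a small sanity check related to the numpy/librosa incompatibility mentioned above; the package list below is my assumption about which packages matter here, not an official compatibility check."""
from importlib import metadata

# Print the resolved versions so incompatibilities are easier to spot.
for pkg in ('spleeter', 'numpy', 'librosa', 'matplotlib'):
    try:
        print(pkg, metadata.version(pkg))
    except metadata.PackageNotFoundError:
        print(pkg, 'not installed')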
"""Next, we import all the required packages."""
from spleeter.separator import Separator
from spleeter.audio.adapter import AudioAdapter
from IPython.display import Audio, display
import librosa.display
import matplotlib.pyplot as plt
import numpy as np
import os
"""Before running the next cell, please insert a .mp3 / .wav file into this directory and paste the name of it in the variable *INPUT_FILE* in the code.
If you are using COLAB, you can find this option by selecting files at the verical bar on the left.
With the current settings, the model separates vocals, bass, drums and accomplishment from the input-file.
However, you can change this by setting *SPLEETER_MODEL* to one of the following:
- spleeter:2stems
- separates only Vocals and accomplishment
- spleeter:4stems
- separates vocals, bass, drums and accomplishment
- spleeter:5stems
- separates vocals, bass, drums, piano and accomplishment
Lets start by separating our first stems.
Running this cell might take some time, depending on your setup, the audio file and the selected model
"""
# insert your file name here
INPUT_FILE = 'example_for_demo.mp3'
INPUT_FILENAME = os.path.splitext(INPUT_FILE)[0]  # base name without extension (matches Spleeter's output folder name)
OUTPUT_DIR = 'outputs'
SUPPORTED_EXTENSIONS = ('.wav', '.mp3')
SAMPLE_RATE = 44100
SPLEETER_MODEL = 'spleeter:5stems' # You might want to change this to 'spleeter:2stems' or 'spleeter:4stems' for different models
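# Optional sanity check: fail early if the input file is missing or does not use
# one of the supported extensions defined above.
if not os.path.isfile(INPUT_FILE):
    raise FileNotFoundError(f"'{INPUT_FILE}' not found - please place it in this directory first.")
if not INPUT_FILE.lower().endswith(SUPPORTED_EXTENSIONS):
    raise ValueError(f"'{INPUT_FILE}' must have one of these extensions: {SUPPORTED_EXTENSIONS}")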
# Initialize separator
separator = Separator(SPLEETER_MODEL)
# Load audio
audio_loader = AudioAdapter.default()
waveform, _ = audio_loader.load(INPUT_FILE, sample_rate=SAMPLE_RATE)
# Perform the separation and save the output
print("Separating audio sources...")
# prediction = separator.separate(waveform) # separating the audio, without saving it
separator.separate_to_file(INPUT_FILE, OUTPUT_DIR)
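"""As noted in the commented-out line above, `separator.separate(waveform)` returns the separated stems in memory instead of writing them to disk. The sketch below shows one way this result could be inspected and saved manually; it assumes the return value is a dict mapping stem names to numpy arrays and that `AudioAdapter.save(path, data, sample_rate)` is available, so adjust it if your Spleeter version behaves differently."""
# prediction = separator.separate(waveform)
# for stem_name, stem_waveform in prediction.items():
#     print(stem_name, stem_waveform.shape)  # e.g. (num_samples, 2) for a stereo file
#     audio_loader.save(f'{OUTPUT_DIR}/{INPUT_FILENAME}/{stem_name}_from_memory.wav',
#                       stem_waveform, SAMPLE_RATE)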
"""## Displaying Audio, Spectrogram and Waveform
Let's create two helper functions that plot the waveform and the spectrogram of an audio signal.
By using different librosa methods, we can easily create such plots.
Finally, we create a function that calls both of them and additionally displays the audio, so that it can be played directly in this notebook.
"""
# Visualization of Waveforms and Spectrograms
# Plot waveform
def plot_waveform(waveform, sr, title='Waveform'):
    plt.figure(figsize=(15, 2))
    librosa.display.waveshow(waveform, sr=sr)
    plt.title(title)
    plt.tight_layout()
    plt.show()

# Plot spectrogram
def plot_spectrogram(signal, sr, title='Spectrogram'):
    D = librosa.amplitude_to_db(np.abs(librosa.stft(signal)), ref=np.max)
    plt.figure(figsize=(15, 2))
    librosa.display.specshow(D, sr=sr, x_axis='time', y_axis='log')
    plt.title(title)
    plt.colorbar(format='%+2.0f dB')
    plt.tight_layout()
    plt.show()

# Display the audio player, waveform and spectrogram for a given audio file
def display_audio_and_plots(fileName):
    display(Audio(fileName))
    print(fileName)
    y, sr = librosa.load(fileName)
    plot_waveform(y, sr, f'Waveform of {fileName}')
    plot_spectrogram(y, sr, f'Spectrogram of {fileName}')
    print('---' * 50)
"""Now we can use this method and display all the relevant formats for the original song and each separated stem.
Here you should be able to notice some characteristics (depending on the song that you picked):
- The bass should be located at the lower frequencies of the spectrogram.
- The spectrogram of the drums often consists of vertical lines, which means that the hits span many frequencies at the same time and are very short (which is typical for drums).
- Vocals often show up as horizontal lines in the spectrogram (depending on the genre), because their notes are usually held longer than those of other instruments.
In the following cell, you can insert the path of any wav/mp3 file and display the different plots.
You might have to comment out some of the display_audio_and_plots() calls, depending on the model that you've chosen (or use the loop sketched after the following cell).
"""
display_audio_and_plots(INPUT_FILE)
# if you are running this in a local environment, you can use the following code to display the audio
display_audio_and_plots(f'{OUTPUT_DIR}/{INPUT_FILENAME}/other.wav')
display_audio_and_plots(f'{OUTPUT_DIR}/{INPUT_FILENAME}/vocals.wav')
display_audio_and_plots(f'{OUTPUT_DIR}/{INPUT_FILENAME}/bass.wav')
display_audio_and_plots(f'{OUTPUT_DIR}/{INPUT_FILENAME}/drums.wav')
display_audio_and_plots(f'{OUTPUT_DIR}/{INPUT_FILENAME}/piano.wav')
# if you are running this in Colab, you can use the following code to display the audio
# display_audio_and_plots(f'/content/{OUTPUT_DIR}/{INPUT_FILENAME}/other.wav')
# display_audio_and_plots(f'/content/{OUTPUT_DIR}/{INPUT_FILENAME}/vocals.wav')
# display_audio_and_plots(f'/content/{OUTPUT_DIR}/{INPUT_FILENAME}/bass.wav')
# display_audio_and_plots(f'/content/{OUTPUT_DIR}/{INPUT_FILENAME}/drums.wav')
# display_audio_and_plots(f'/content/{OUTPUT_DIR}/{INPUT_FILENAME}/piano.wav')
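"""Instead of commenting calls in and out by hand, you can also loop over the stems that the selected model is expected to produce. The mapping below is an assumption based on the model descriptions earlier in this notebook, not something queried from Spleeter itself."""
# Expected stem names per model (assumed from the model descriptions above)
STEMS_PER_MODEL = {
    'spleeter:2stems': ['vocals', 'accompaniment'],
    'spleeter:4stems': ['vocals', 'drums', 'bass', 'other'],
    'spleeter:5stems': ['vocals', 'drums', 'bass', 'piano', 'other'],
}

for stem in STEMS_PER_MODEL[SPLEETER_MODEL]:
    display_audio_and_plots(f'{OUTPUT_DIR}/{INPUT_FILENAME}/{stem}.wav')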
"""## Running Spleeter on Command Line
The following cell demonstrates how to use Spleeter on the command line. You can execute the cell here in this notebook, or copy the command into your terminal. Note that if you run it directly on the command line, you have to remove the "!" before the command.
Again, you can choose between different models here: either 2, 4 or 5 stems can be separated.
To run Spleeter on the command line, you have to provide an output directory (here: audio_output), an audio file (here: example_for_demo.mp3) and optionally a model via the -p option (spleeter:2stems is the default).
"""
# 2 stems
!spleeter separate -o audio_output example_for_demo.mp3
# 4 stems
# !spleeter separate -o audio_output -p spleeter:4stems example_for_demo.mp3
# 5 stems
# !spleeter separate -o audio_output -p spleeter:5stems example_for_demo.mp3
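"""Spleeter's command line accepts a few additional options, for example to change the output codec or to only process part of the file. The flags shown below (-c for codec, -d for duration in seconds) are based on my understanding of the CLI and may vary between versions, so run the help command first if they are not accepted."""
# Show all available options of the separate command
# !spleeter separate --help
# Example with the assumed flags: write mp3 stems and only process the first 60 seconds
# !spleeter separate -o audio_output -p spleeter:4stems -c mp3 -d 60 example_for_demo.mp3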