# GuitarLSTM

GuitarLSTM trains guitar effect/amp neural network models for processing
wav files. Record input/output samples from the target guitar amplifier or
pedal, then use this code to create a deep learning model of the
sound. The model can then be applied to other wav files to make them sound
like the amp or effect. This code uses TensorFlow/Keras.

The LSTM (Long Short-Term Memory) model is effective at copying the sound of
tube amplifiers, distortion, overdrive, and compression. It also captures the
impulse response of the mic/cab used for recording the samples. Compared
to the WaveNet model, this implementation is much faster and can more accurately
copy the sound of complex guitar signals while still training on a CPU.

## Info

A variation on the LSTM model from the research paper [Real-Time Guitar Amplifier Emulation with Deep
Learning](https://www.mdpi.com/2076-3417/10/3/766/htm).

For a great explanation of how LSTMs work, check out this [blog post](https://colah.github.io/posts/2015-08-Understanding-LSTMs/).

## Data

`data/ts9_test1_in_FP32.wav` - Playing from a Fender Telecaster, bridge pickup, max tone and volume<br>
`data/ts9_test1_out_FP32.wav` - Split with a JHS Buffer Splitter into an Ibanez TS9 Tube Screamer
(max drive, mid tone and volume)<br>
`models/ts9_model.h5` - Pretrained model weights

## Usage

**Train model and run effect on .wav file**:
Wav files must be single channel, 44.1 kHz, FP32 (not int16).
```bash
# Preprocess the input data, perform training, and generate test wavs and analysis plots.
# Specify the input wav file, output wav file, and desired model name.
# Output will be saved to the "models/out_model_name/" folder.

python train.py data/ts9_test1_in_FP32.wav data/ts9_test1_out_FP32.wav out_model_name


# Run prediction on a target wav file.
# Specify the input file, desired output file, and model path.
python predict.py data/ts9_test1_in_FP32.wav output models/ts9_model.h5
```

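The FP32 requirement matters if your recordings were captured as int16 PCM. A minimal conversion sketch, assuming NumPy is available (the function name and scaling here are illustrative, not part of train.py):

```python
import numpy as np

def int16_to_float32(samples: np.ndarray) -> np.ndarray:
    """Scale int16 PCM samples into the [-1.0, 1.0) float32 range."""
    return samples.astype(np.float32) / 32768.0

# Full-scale int16 values map onto the float32 edges:
pcm = np.array([-32768, 0, 16384], dtype=np.int16)
print(int16_to_float32(pcm))  # [-1.   0.   0.5]
```

`scipy.io.wavfile.read` returns int16 arrays for PCM wavs, and `scipy.io.wavfile.write` saves a float32 array as an FP32 wav, so this conversion can slot between those two calls.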
**Training parameters**:

```bash
# Use these arguments with train.py to further customize the model:

--training_mode=0  # 0, 1, or 2 for speed training, accuracy training, or extended training, respectively
--input_size=150   # sets the number of previous samples to consider for each output sample of audio
--split_data=3     # splits the input data into X sections to reduce RAM usage; trains the model on each section separately
--max_epochs=1     # sets the number of epochs to train for; intended to be increased dramatically for extended training
--batch_size=4096  # sets the batch size of data for training

# Edit the "TRAINING MODE" or "Create Sequential Model" section of train.py to further customize each layer of the neural network.
```
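To make `--input_size` concrete: each output sample is predicted from the preceding `input_size` input samples. A rough NumPy sketch of this many-to-one framing (names are illustrative, not the actual train.py code):

```python
import numpy as np

def make_windows(x: np.ndarray, y: np.ndarray, input_size: int):
    """Pair each window of `input_size` input samples with the single
    output sample it predicts (many-to-one framing)."""
    n = len(x) - input_size + 1
    X = np.stack([x[i:i + input_size] for i in range(n)])
    Y = y[input_size - 1:]
    return X, Y

x = np.arange(10, dtype=np.float32)  # stand-in for the input wav
y = x * 0.5                          # stand-in for the recorded amp output
X, Y = make_windows(x, y, input_size=4)
print(X.shape, Y.shape)  # (7, 4) (7,)
```

Because `X` stores `input_size` overlapping copies of every sample, raising `--input_size` multiplies the preprocessed data held in RAM, which is the limitation described below.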

**Colab Notebook**:
Use the Google Colab notebook (guitar_lstm_colab.ipynb) to train
GuitarLSTM models in the cloud. See the notebook comments for instructions.

## Training Info

Helpful tips on training models:
1. Wav files should be 3 - 4 minutes long and contain a variety of
   chords, individual notes, and playing techniques, to give the model a
   full spectrum of data to "learn" from.
2. A buffer splitter was used with pedals to obtain a pure guitar signal
   and a post-amp/effect signal. You can also use a feedback loop from your
   audio interface to record input/output simultaneously.
3. Sample data from an amp can be obtained by splitting off the original
   signal, with the post-amp signal coming from a microphone (an SM57 was used here).
   Keep in mind that this captures the dynamic response of the mic and cabinet.
   In the original research, the sound was captured directly from within the amp
   circuit to obtain a "pure" amp signal.
4. Generally speaking, the more distorted the effect/amp, the more difficult it
   is to train.
5. Training requires float32 .wav files (as opposed to int16).

## Limitations and future work

This implementation of the LSTM model uses a large amount of
RAM to preprocess wav data. If you experience crashes due to
limited memory, reduce the "input_size" parameter by using
the "--input_size=" flag with train.py. The default setting is 100,
which requires about 8GB of RAM. Increasing this setting will improve
training accuracy, but the size of the preprocessed wav data in
RAM will increase as well.

You can also use the "--split_data" parameter with train.py to
train the same model on separate sections of the data. This
reduces RAM usage while still allowing a high input_size
setting. For example, "--split_data=5" would split the data
into 5 sections and train on each section separately. The default
is 1, i.e. no splitting.
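The idea can be sketched with `numpy.array_split` (this mirrors the behavior described above, not the literal train.py code):

```python
import numpy as np

x = np.arange(10, dtype=np.float32)  # stand-in for a preprocessed wav
sections = np.array_split(x, 5)      # --split_data=5 -> 5 roughly equal sections

# train.py would window and fit the model on each section in turn,
# so only one section's windows occupy RAM at a time.
for i, chunk in enumerate(sections):
    print(i, len(chunk))
```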

A custom dataloader has been added to the Colab notebook, using MSE
for the loss calculation. This reduces RAM usage and eliminates the
need for the --split_data parameter.
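A dataloader of that kind can be sketched as a generator that builds each batch's windows lazily instead of materializing them all up front (illustrative only; the notebook's actual implementation may differ):

```python
import numpy as np

def batch_generator(x, y, input_size, batch_size):
    """Yield (X_batch, Y_batch) pairs, windowing on the fly to keep RAM flat."""
    n = len(x) - input_size + 1          # total number of windows
    for start in range(0, n, batch_size):
        idx = np.arange(start, min(start + batch_size, n))
        X = np.stack([x[i:i + input_size] for i in idx])
        Y = y[idx + input_size - 1]      # the sample each window predicts
        yield X, Y

x = np.arange(100, dtype=np.float32)
y = x.copy()
batches = list(batch_generator(x, y, input_size=10, batch_size=32))
print(len(batches), batches[0][0].shape)  # 3 (32, 10)
```

Only one batch of windows exists in memory at a time, which is how this approach removes the need for `--split_data`.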

A real-time implementation for use in a guitar plugin: [SmartAmpPro](https://github.com/GuitarML/SmartAmpPro)

Note: Model training has been integrated into the SmartAmpPro plugin, but
models trained with GuitarLSTM are not currently compatible with that plugin.