Good gracious, I just found some more data: the Nottingham Database, a collection of ABC-formatted music files. This format can be converted to MIDI and vice versa. Of course I'm facing a problem: the specific data I want, MIDI covering a lot of different genres, is not widely available. Therefore I have a couple of options:

- Train on actual MP3/OGG/WAV/FLAC audio files, which is going to take forever, although the FMA data set does offer 30s samples of its whole collection of songs.
- Use NSynth, a collection of single-instrument notes, which is mostly suited to synthesizing instruments rather than generating songs/music.
- Use the Nottingham Database, an ABC-formatted database.

The most suitable option is the ABC-formatted database; there is more of this kind to find and I'm currently tracking down more data. However, there are some caveats along the way. I have found papers using all of these data sets, therefore it is very likely that th
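To get a feel for the ABC-to-MIDI correspondence, here is a minimal sketch that maps single ABC pitch tokens to MIDI note numbers. It assumes the key of C (no key-signature accidentals) and ignores note lengths; `abc_pitch_to_midi` is a hypothetical helper for illustration, not a full ABC parser.

```python
# Minimal sketch: single ABC pitch tokens -> MIDI note numbers.
# In ABC, uppercase "C" is middle C (MIDI 60), lowercase "c" is the
# octave above, apostrophes raise and commas lower by an octave.

ABC_BASE = {"C": 60, "D": 62, "E": 64, "F": 65, "G": 67, "A": 69, "B": 71}

def abc_pitch_to_midi(token: str) -> int:
    """Convert an ABC pitch like "C", "c", "G," or "c'" to a MIDI number."""
    letter, rest = token[0], token[1:]
    midi = ABC_BASE[letter.upper()]
    if letter.islower():          # lowercase letter = one octave up
        midi += 12
    midi += 12 * rest.count("'")  # apostrophes raise by an octave each
    midi -= 12 * rest.count(",")  # commas lower by an octave each
    return midi

print(abc_pitch_to_midi("C"))   # middle C -> 60
print(abc_pitch_to_midi("c'"))  # two octaves above middle C -> 84
```

A real pipeline would of course use a proper converter (tools like abc2midi exist), but this shows why the two formats translate so directly: ABC pitches map one-to-one onto MIDI note numbers.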
As seen in the updated mindmap below, there is a lot going on internally when making music, but there is also the maker's emotional influence on the listener, whether intended or not. This makes for a more complete structure of the research field and includes all the different parts that make music in itself an interesting thing to study. Looking at the technical details of an artificial music generator, there is a part that analyses data, a part that implements learned details (the building blocks of melody and song structure), and the actual generator, which uses those learned details and rules to produce music. In a set-up with GANs it is possible to generate more data with the generator, while the discriminator is fed this output and has to tell it apart from real data. The generator is thereby attuned to producing different kinds of subsets and learns better what the difference with the original data set is. The sequential aspect of music makes it less su
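The adversarial loop can be sketched very compactly. Below is a toy GAN on 1-D data: real samples come from N(4, 1), the generator shifts and scales Gaussian noise, and the discriminator is logistic regression. All parameter names, sizes, and the learning rate are illustrative assumptions, not anything from a paper or from my actual set-up.

```python
import numpy as np

# Toy GAN sketch: generator vs discriminator on 1-D data.
rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

a, b = 1.0, 0.0   # generator: x = a * z + b, with noise z ~ N(0, 1)
w, c = 0.0, 0.0   # discriminator: D(x) = sigmoid(w * x + c)
lr = 0.01

for step in range(2000):
    z = rng.standard_normal(32)
    fake = a * z + b
    real = rng.normal(4.0, 1.0, 32)

    # Discriminator step: push D(real) -> 1 and D(fake) -> 0.
    dr, df = sigmoid(w * real + c), sigmoid(w * fake + c)
    grad_w = np.mean((dr - 1) * real) + np.mean(df * fake)
    grad_c = np.mean(dr - 1) + np.mean(df)
    w -= lr * grad_w
    c -= lr * grad_c

    # Generator step: push D(fake) -> 1 (non-saturating loss).
    df = sigmoid(w * fake + c)
    grad_a = np.mean((df - 1) * w * z)
    grad_b = np.mean((df - 1) * w)
    a -= lr * grad_a
    b -= lr * grad_b

# After training, the generator's offset b should have drifted
# toward the real data's mean of 4.
print(b)
```

For music the samples would be sequences or piano-roll matrices rather than scalars, but the two-player structure — generator producing candidates, discriminator separating them from the data set — is exactly this loop.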
I've been reading about a lot of different things and it is time to focus. Right now there are multiple ways I can solve the music generation problem: Recurrent Neural Networks (RNNs), Convolutional Neural Networks (CNNs), and a combined approach in which a CNN is paired with a Generative Adversarial Network (GAN), which amounts to a state-of-the-art approach. RNNs are easier to train and capable of generating good MIDI files (when listeners judge them so) while maintaining structural cohesion in an Encoder-Decoder model (EDM). This method passes the hidden states of the encoder to the decoder to reach a level of music generation that is high-level enough to be interesting for my project. However, since it has been shown that RNNs can handle 2D data as well, the only reason to choose a CNN & GAN combination is that it has been done in only one paper I found before. While RNNs have been w
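The encoder-decoder handoff described above can be sketched as follows: a plain Elman RNN encoder reads a note sequence, and its final hidden state seeds the decoder, which then generates greedily step by step. The vocabulary size, hidden size, and random (untrained) weights are placeholders for illustration.

```python
import numpy as np

# Encoder-decoder RNN sketch: the encoder's final hidden state
# initializes the decoder, which generates tokens autoregressively.
rng = np.random.default_rng(1)
vocab, hidden = 16, 8          # e.g. 16 pitch classes, 8 hidden units

W_xh = rng.normal(0, 0.1, (hidden, vocab))
W_hh = rng.normal(0, 0.1, (hidden, hidden))
W_hy = rng.normal(0, 0.1, (vocab, hidden))

def one_hot(i):
    v = np.zeros(vocab)
    v[i] = 1.0
    return v

def encode(tokens):
    """Run the encoder over the input; return its final hidden state."""
    h = np.zeros(hidden)
    for t in tokens:
        h = np.tanh(W_xh @ one_hot(t) + W_hh @ h)
    return h

def decode(h, start, steps):
    """Generate greedily, starting from the encoder's hidden state."""
    out, tok = [], start
    for _ in range(steps):
        h = np.tanh(W_xh @ one_hot(tok) + W_hh @ h)
        tok = int(np.argmax(W_hy @ h))   # greedy pick of the next token
        out.append(tok)
    return out

h = encode([3, 5, 7, 5])       # summarize an input melody fragment
print(decode(h, start=3, steps=6))
```

With trained weights and a real note vocabulary, the same handoff — encoder hidden state flowing into the decoder — is what lets the generated continuation stay coherent with the conditioning material.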