<?xml version="1.0" encoding="UTF-8"?>
<record>
  <title>Music, Artificial Intelligence, and Deep Learning: A Review of Concepts and Genre Classification Approaches</title>
  <journal>Journal of Intelligent Computing</journal>
  <author>Fangfang Zhou</author>
  <volume>17</volume>
  <issue>1</issue>
  <year>2026</year>
  <doi>https://doi.org/10.6025/jic/2026/17/1/26-36</doi>
  <url>https://www.dline.info/jic/fulltext/v17n1/jicv17n1_3.pdf</url>
  <abstract>The document &quot;Music, Artificial Intelligence, and Deep Learning: A Review of Concepts and Genre
Classification Approaches&quot; provides a comprehensive overview of AI's transformative role in music
information retrieval (MIR), with a focus on music genre classification (MGC). It outlines key AI applications
in music, including genre classification, recommendation, melody extraction, therapy, and generation, and
traces the evolution from traditional machine learning (e.g., kNN, SVM) to deep learning models (e.g., CNNs,
LSTMs). The review emphasizes that while traditional methods rely on handcrafted audio features like MFCCs,
chroma, and rhythm histograms, deep learning enables automatic feature learning from raw audio or
time-frequency representations, yielding significantly higher accuracy (up to 97.8% in recent studies). Tables
compare model performance, feature extraction techniques, and trade-offs between interpretability and
scalability. The paper acknowledges persistent challenges, such as capturing long-term harmonic structure
and the data-hungry nature of deep models. It highlights innovations such as Shuai Yu's NHAN-GAF for melody
extraction and hybrid architectures (e.g., CNN-LSTM) that integrate temporal and spectral modeling. The
proposed AI-based music processing pipeline, spanning audio acquisition, preprocessing, feature extraction,
and deep modeling, illustrates an end-to-end workflow for modern MIR systems. The review concludes that
AI, especially deep learning, enhances rather than replaces human creativity, and calls for future work on
music-aware networks, attention mechanisms, and larger, more diverse datasets to improve robustness and
interpretability in real-world applications.</abstract>
</record>
