Welcome to Spotichart’s documentation!

Contents:

spotichart package

Subpackages

spotichart.language package

Submodules
spotichart.language.main module

language detector main module

spotichart.language.main.detect_language(text)

Detect the language of a given text

Parameters: text (str) – The text whose language should be identified
Returns: Language code identified
Return type: str
Module contents

language detector package.

spotichart.lyrics package

Submodules
spotichart.lyrics.lyrics_scraper module

spotichart.lyrics.lyrics_scraper module.

spotichart.lyrics.lyrics_scraper.scrap_lyrics(lyrics_url, headers=False, ad_libs=False)

Download the lyrics from a resource located on Genius, by web scraping.

Parameters:
  • lyrics_url (str) – Link to the Genius lyrics
  • headers (bool, optional) – Whether to keep the section headers. Section headers are explained at https://genius.com/9250687. Defaults to False
  • ad_libs (bool, optional) – Whether to keep the ad-lib sound effects (surrounded by parentheses). Ad-libs are explained at https://genius.com/9257397. Defaults to False
Returns:

Lyrics found

Return type:

str
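As a rough illustration of what the headers and ad_libs flags control, the sketch below filters a lyrics string with regular expressions. The clean_lyrics helper and the sample text are hypothetical and are not the scraper's actual implementation; the sketch assumes that section headers sit alone on a line inside square brackets and that ad-libs never span more than one line.

```python
import re


def clean_lyrics(raw, headers=False, ad_libs=False):
    """Hypothetical helper: optionally drop section headers and ad-libs."""
    text = raw
    if not headers:
        # Remove whole lines such as "[Chorus]" or "[Verse 1: Artist]".
        text = re.sub(r"^\[[^\]\n]*\][ \t]*(?:\n|$)", "", text, flags=re.MULTILINE)
    if not ad_libs:
        # Remove parenthesised ad-libs such as "(yeah)" and the spaces before them.
        text = re.sub(r"[ \t]*\([^)\n]*\)", "", text)
    # Collapse runs of blank lines left behind by the removals.
    text = re.sub(r"\n{3,}", "\n\n", text)
    return text.strip()


# Made-up sample text, only to exercise the two flags.
sample = "[Verse 1]\nHello world (yeah)\nSecond line\n\n[Chorus]\nLa la (woo) la\n"
print(clean_lyrics(sample))
print(clean_lyrics(sample, headers=True, ad_libs=True))
```

With both flags left at False only the sung lines remain; with both set to True the text is returned unchanged apart from trimming.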

spotichart.lyrics.main module

spotichart.lyrics.main entry module.

spotichart.lyrics.main.get_lyrics(access_token, track_id, track_name, artist)

Get the Lyrics for an individual track

Parameters:
  • access_token (str) – Genius API Access Token
  • track_id (str) – Spotify Track Id, to identify different tracks
  • track_name (str) – Track Name to search
  • artist (str) – Track’s Artist or Performer
Returns:

Dictionary with the song lyrics, the Genius ID and the language identified

Return type:

dict

spotichart.lyrics.main.get_lyrics_from_chart(access_token, chart, sleep=1)

Get track lyrics from a DataFrame with ‘Track Id’, ‘Track Name’ and ‘Artist’ columns

Parameters:
  • access_token (str) – Genius API Access Token
  • chart (pandas.DataFrame) – Pandas DataFrame that provides the Track Id, Track Name and Artist of each track
  • sleep (int, optional) – Sleep timer to rest the scraper, defaults to 1
Returns:

Dataframe with Lyrics and Language identified

Return type:

pandas.DataFrame
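The per-row loop with a rest between requests could look roughly like the sketch below. It uses a plain list of dictionaries in place of a DataFrame so that it runs without pandas; collect_lyrics and fake_fetch are hypothetical names, and fake_fetch stands in for the real per-track lookup so that no network access is needed.

```python
import time


def collect_lyrics(rows, fetch, sleep=1):
    """Call fetch once per chart row, pausing between consecutive requests."""
    results = []
    for index, row in enumerate(rows):
        if index > 0:
            time.sleep(sleep)  # rest between requests, not before the first one
        results.append(fetch(row["Track Id"], row["Track Name"], row["Artist"]))
    return results


# Made-up rows carrying the three columns the function description mentions.
rows = [
    {"Track Id": "id-1", "Track Name": "Song A", "Artist": "Artist A"},
    {"Track Id": "id-2", "Track Name": "Song B", "Artist": "Artist B"},
]


def fake_fetch(track_id, track_name, artist):
    """Hypothetical stand-in for a lyrics lookup; returns a placeholder record."""
    return {"Track Id": track_id, "query": f"{track_name} {artist}"}


lyrics = collect_lyrics(rows, fake_fetch, sleep=0)
print(lyrics)
```

Passing the lookup as an argument keeps the throttling logic testable on its own.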

spotichart.lyrics.track_features module

spotichart.lyrics.track_features module.

spotichart.lyrics.track_features.request_song_info(access_token, track_name, artist)

Search the track’s metadata in Genius

Parameters:
  • access_token (str) – Genius API access token
  • track_name (str) – Track Name
  • artist (str) – Track’s Artist or Performer
Returns:

Genius API response

Return type:

json

spotichart.lyrics.track_features.search_song(access_token, track_name, artist)

Locate the song’s lyrics page on Genius to obtain its URL

Parameters:
  • access_token (str) – Genius API Access Token
  • track_name (str) – Track Name
  • artist (str) – Track’s Artist or Performer
Raises:

ValueError – Error on response

Returns:

Track’s lyrics and id on Genius

Return type:

str
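One plausible way to pick the right entry out of a search response is sketched below. It assumes the payload follows the shape of the Genius search endpoint (meta.status plus response.hits, each hit carrying a result with url, id and primary_artist.name). The pick_hit helper and the sample payload are hypothetical and do not reproduce the package's own matching logic.

```python
def pick_hit(payload, artist):
    """Return (url, genius_id) of the first hit by the given artist, or None."""
    status = payload.get("meta", {}).get("status")
    if status != 200:
        raise ValueError(f"Error on response: status {status}")
    wanted = artist.casefold()
    for hit in payload["response"]["hits"]:
        result = hit["result"]
        # Case-insensitive substring match on the primary artist's name.
        if wanted in result["primary_artist"]["name"].casefold():
            return result["url"], result["id"]
    return None


# Made-up payload in the assumed shape; the URLs and ids are placeholders.
sample = {
    "meta": {"status": 200},
    "response": {
        "hits": [
            {"result": {"id": 1, "url": "https://genius.com/example-a",
                        "primary_artist": {"name": "Artist A"}}},
            {"result": {"id": 2, "url": "https://genius.com/example-b",
                        "primary_artist": {"name": "Artist B"}}},
        ]
    },
}

print(pick_hit(sample, "artist b"))
```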

Module contents

lyrics package.

spotichart.spotipy package

Submodules
spotichart.spotipy.audio_features module

spotichart.spotipy.audio_features module.

spotichart.spotipy.audio_features.get_audio_features(access_token, track_id)

Function to fetch the audio features of a song

Parameters:
  • access_token (str) – Spotify Web API Access Token
  • track_id (str) – Spotify Track identifier
Raises:
  • ValueError – Spotify Request Error
  • ValueError – Http Request Error
Returns:

Track’s audio features

Return type:

dict
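For orientation, the sketch below builds (but does not send) a request to the audio-features endpoint of the Spotify Web API and checks a decoded response body for an error object. The build_request and check_payload helpers are hypothetical, and the mapping of error objects to ValueError is an assumption based on the Raises entries above, not the package's code.

```python
import urllib.request

# Audio-features endpoint of the Spotify Web API.
API_URL = "https://api.spotify.com/v1/audio-features/{track_id}"


def build_request(access_token, track_id):
    """Prepare a GET request that carries the token as a Bearer header."""
    return urllib.request.Request(
        API_URL.format(track_id=track_id),
        headers={"Authorization": f"Bearer {access_token}"},
    )


def check_payload(payload):
    """Raise ValueError if the decoded JSON body is a Spotify error object."""
    if "error" in payload:
        error = payload["error"]
        raise ValueError(
            f"Spotify request error {error.get('status')}: {error.get('message')}"
        )
    return payload


request = build_request("YOUR-ACCESS-TOKEN", "TRACK-ID")
print(request.full_url)
```

Sending the request would be a matter of passing it to urllib.request.urlopen and decoding the JSON body before handing it to check_payload.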

spotichart.spotipy.main module

spotichart.spotipy.main entry point module.

spotichart.spotipy.main.generate_top_chart(access_token, start, end=None, region='en', chart='top200', sleep=1)

Function to fetch the top chart for a given date (or interval of dates) and request the audio features of its tracks

Parameters:
  • access_token (str) – Spotify Web API Access token
  • start (Date) – Starting point for the scraper to get the top chart
  • end (Date, optional) – Interval for multi-chart table, defaults to None
  • region (str, optional) – Spotify Top 50 region code, defaults to ‘en’
  • chart (str, optional) – Spotify chart to get the data from, either top200 or viral, defaults to ‘top200’
  • sleep (int, optional) – Sleep time for the scraper to rest, defaults to 1
Returns:

Dataframe that stores the chart data, and the audio features for each track

Return type:

pandas.DataFrame

spotichart.spotipy.top_charts module

spotichart.spotipy.top_charts module. Based on the repository by fbkarsdorp, located at https://github.com/fbkarsdorp/spotify-chart

spotichart.spotipy.top_charts.get_chart(date, region='en', chart='top200')

Download an individual chart

Parameters:
  • date (Date) – Specific date for a Top Chart
  • region (str, optional) – Spotify Top 50 region code, defaults to ‘en’
  • chart (str, optional) – Spotify chart to get the data from, either top200 or viral, defaults to ‘top200’
Raises:

ValueError – Unavailable data requested

Returns:

Top 50 Chart

Return type:

pandas.DataFrame

spotichart.spotipy.top_charts.get_charts(start, end=None, region='global', chart='top200', sleep=1)

Fetch multiple Charts

Parameters:
  • start (Date) – Starting date to download the chart
  • end (Date, optional) – End date for an interval of top charts, defaults to None
  • region (str, optional) – Spotify Top 50 region code, defaults to ‘global’
  • chart (str, optional) – Spotify chart to get the data from, either top200 or viral, defaults to ‘top200’
  • sleep (int, optional) – Sleep time for the scraper to rest, defaults to 1
Raises:

ValueError – Invalid date interval format

Returns:

Chart with the Top 50 basic data

Return type:

pandas.DataFrame
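The expansion of a start and end date into the individual chart dates might be sketched as follows. It assumes ISO-formatted date strings, as in the Usage example, and one chart per day; chart_dates is a hypothetical helper, and the error message only mirrors the Raises entry above.

```python
from datetime import datetime, timedelta


def chart_dates(start, end=None):
    """Return every date from start to end, both included (end defaults to start)."""
    try:
        first = datetime.strptime(start, "%Y-%m-%d").date()
        last = datetime.strptime(end, "%Y-%m-%d").date() if end is not None else first
    except (TypeError, ValueError) as exc:
        raise ValueError("Invalid date interval format") from exc
    if last < first:
        raise ValueError("Invalid date interval format")
    span = (last - first).days
    return [first + timedelta(days=offset) for offset in range(span + 1)]


print(chart_dates("2019-01-30", "2019-02-02"))
```

A caller would then download one chart per returned date, resting for sleep seconds between downloads.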

Module contents

spotipy package.

Module contents

spotichart package.

Domain Model


Attributes involved in the scope of this package. Provided from Spotify Top Charts, Spotify Web API, Genius API, and Genius itself (by web scraping the lyrics).

Model View

The providers and the inner modules are therefore related as follows, shown as a Model View with Use Style.

Model View

Additionally, the guess-language-spirit package is used to detect the language of the lyrics. The language package wraps this dependency so that the language-detection provider is easier to switch.


Spotichart


Collector Module for Spotify National Trending Analysis

Introduction

The Spotichart module makes it easy for data scientists and programmers to get the features of the trending songs on Spotify. You can define a period of time and a region, and get the main characteristics of the top songs.

Documentation

The official documentation is available at: Read The Docs

Installation

TODO: Not Yet Published

$ pip install spotichart

Requirements

  • Python >= 3.6
  • Spotify Web API Access Token. You can request yours here by clicking on GET TOKEN, then copying the token from the OAuth Token field.
  • (Optional) Genius Web API Access Token. From the official docs page, select Authenticate with the Docs App To Try and copy the Authorization Bearer token provided after logging in.

Synopsis

Usage

To get only the audio features for a given date (or period) and region:

import spotichart

spotify_token = 'YOUR-ACCESS-TOKEN-FROM-THE-WEB-API'

chart = spotichart.generate_top_chart(spotify_token, start='2019-01-01', end='2019-10-13', region='mx')

To additionally retrieve each song’s lyrics and Genius ID, and to auto-detect the language of the lyrics:

import spotichart

spotify_token = 'YOUR-SPOTIFY-ACCESS-TOKEN-FROM-THE-WEB-API'
genius_token = 'YOUR-GENIUS-ACCESS-TOKEN-FROM-THE-WEB-API'

chart = spotichart.generate_top_chart(spotify_token, start='2019-01-01',
                                   end='2019-10-13', region='mx', sleep=0.5)

chart_with_lyrics = spotichart.get_lyrics_from_chart(genius_token, chart, sleep=0.1)

Note: Since these functions make web requests to get the data, the sleep parameter pauses between requests so that the server does not refuse them. By default, sleep is set to 1 second.

The DataFrame

A pandas.DataFrame will be generated with the data of interest:

>>> chart
       Position                                      Track Name           Artist  Streams  ... speechiness    tempo time_signature  valence
0             1                                   Calma - Remix       Pedro Capó   737894  ...      0.0524  126.899              4    0.761
1             2                                      Adan y Eva     Paulo Londra   415066  ...      0.3360  171.993              4    0.720
2             3  Taki Taki (with Selena Gomez, Ozuna & Cardi B)         DJ Snake   409061  ...      0.2290   95.948              4    0.591
3             4                               MIA (feat. Drake)        Bad Bunny   377855  ...      0.0621   97.062              4    0.158
4             5                               A Través Del Vaso    Grupo Arranke   346975  ...      0.0297  143.851              3    0.920
...         ...                                             ...              ...      ...  ...         ...      ...            ...      ...
14295        46                                       Con Calma     Daddy Yankee   141397  ...      0.0593   93.989              4    0.656
14296        47                          La Escuela No Me Gustó    Adriel Favela   139350  ...      0.0371  112.548              4    0.844
14297        48                          De Los Besos Que Te Di  Christian Nodal   139294  ...      0.0422  195.593              4    0.709
14298        49                                   Pa Mí - Remix            Dalex   137812  ...      0.2200  170.018              4    0.727
14299        50                                         Circles      Post Malone   131109  ...      0.0395  120.042              4    0.5

[14300 rows x 20 columns]
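The row count can be cross-checked against the period requested in the Usage example. The short calculation below assumes one chart per day with both endpoints included and 50 positions per chart, which is inferred from the Position column running from 1 to 50 in the output above rather than stated by the package.

```python
from datetime import date

start, end = date(2019, 1, 1), date(2019, 10, 13)

days = (end - start).days + 1  # both endpoints included
positions_per_chart = 50       # assumption read off the Position column
rows = days * positions_per_chart

print(days, rows)
```

This gives 286 days and 14300 rows, matching the printed DataFrame shape.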