A store website aimed at selling music records, as well as other music related products.
The first step was initializing the repository. The name “Magaro” comes
from the first two letters of our surnames (Macías, García and Rodríguez).
We choose to add a README.md and the MIT License, since it is the most popular
and permissive license available on GitHub. We also added the file _config.yml
to set the initial configuration, such as the project name, the web theme and
a short description, in order to use GitHub Pages to post the project documentation.
The chosen subject for the store is music (CDs, vinyls and cassettes.)
After doing a research on webs and APIs, we selected two potential APIs for our website. These are:
We finally chose Discogs API over iTunes, since it focuses on physical market, which fits better our project.
In order to get ready to do web scraping, it is necessary to setup our project first. So the first step in this process is to create and activate a virtual enviroment.
It is also necessary to install some libraries: pip install requests beautifulsoup4 lxml selenium python-dotenv
This is the recommended project structure:
project/
├── .env
├── requirements.txt
├── scraper.py
├── api_client.py
└── data/
└── output.json
The next step is to extract the data from the web or the API, using the libraries we already installed. Once we extracted the data, we dump them into a JSON file.
To make sure our work is ethical, we must respect the legal terms of service each website provides. This information is stored in the robots.txt file, so we must always check before we start working.
To begin, we need to install the folowing web server:
sudo apt install apache2
The goal is to encrypt the connection between the user’s browser and the server. To implement a secure server we must obtain an SSL/TLS certificate. Let’s Encrpyt is a platform that provides free and open certificates. We also must install Certbot to install the certificate.
The two-factor authenticator(2FA) adds an extra layer of security by requiring a temporary code in addition to the password. In order to enable the 2FA, we must install and enable the mod_authn_otp Apache module. Finally, we also must create a file to stores each user’s keys.
We created two virtual machines based on Linux Mint(v.21.2), one for the database and the other for the web server. The first adapter is configured as NAT to allow internet access exclusively for system maintenance and downloading web dependencies
We created a second adapter for each VM configured as ‘host only’ so both of the VM can communicate with each other.

On our website’s database machine, we installed PostgreSQL, which comes pre-installed on the Linux Mint virtual machine we’re using. We also installed pgAdmin 4 from the Linux Mint software manager application, set a password for its use since we’ll be storing our store’s database there, and the machine is now ready to use.
After studying and analyzing our website, the functional requirements were established.
robots.txt.Since the project structure is simple, it is possible to make the relational model without referencing a entity/relational diagram.

Looking at the relational model, we defined the database structure and we created it in PgAdmin4 by command line. We used the next SQL sentence to create it:
CREATE TABLE artists (
artist_id SERIAL PRIMARY KEY,
discogs_artist_id INT UNIQUE NULL,
name VARCHAR(150) NOT NULL,
profile TEXT
);
CREATE TABLE genres (
genre_id SERIAL PRIMARY KEY,
name VARCHAR(50) NOT NULL UNIQUE
);
CREATE TABLE formats (
format_id SERIAL PRIMARY KEY,
name VARCHAR(30) NOT NULL UNIQUE
);
CREATE TABLE products (
product_id SERIAL PRIMARY KEY,
discogs_product_id INT UNIQUE NULL,
title VARCHAR(255) NOT NULL,
artist_id INT NULL,
genre_id INT NULL,
format_id INT NULL,
release_date DATE,
reference_price NUMERIC(10,2),
origin_url VARCHAR(500),
img_url VARCHAR(500),
last_update TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
CONSTRAINT fk_products_artists FOREIGN KEY (artist_id)
REFERENCES artists(artist_id) ON DELETE RESTRICT,
CONSTRAINT fk_products_genres FOREIGN KEY (genre_id)
REFERENCES genres(genre_id) ON DELETE RESTRICT,
CONSTRAINT fk_products_formats FOREIGN KEY (format_id)
REFERENCES formats(format_id) ON DELETE RESTRICT

In order to establish a functional connection between the web-server VM and the database VM, the first step is to make some changes in the postgres configuration files, in the database VM.
postgresql.confSearch for the line listen_addresses and change its value to '*'
pg_hba.confTo authorize the web server IP address, it is necessary to edit the pg_hba.conf file, adding a new line at the end of the file with the following information:
| Type | Database | User | IP_address | method |
|---|---|---|---|---|
| host | all | all | 192.168.56.103/24 | scram-sha 256 |