Magaro

A store website aimed at selling music records, as well as other music related products.

View My GitHub Profile

DAY 1

REPOSITORY CREATION AND GITHUB PAGES CONFIGURATION

The first step was initializing the repository. The name “Magaro” comes from the first two letters of our surnames (Macías, García and Rodríguez). We choose to add a README.md and the MIT License, since it is the most popular and permissive license available on GitHub. We also added the file _config.yml to set the initial configuration, such as the project name, the web theme and a short description, in order to use GitHub Pages to post the project documentation. The chosen subject for the store is music (CDs, vinyls and cassettes.)

API RESEARCH AND CHOSEN API

After doing a research on webs and APIs, we selected two potential APIs for our website. These are:

We finally chose Discogs API over iTunes, since it focuses on physical market, which fits better our project.

WEB SCRAPING RESEARCH

In order to get ready to do web scraping, it is necessary to setup our project first. So the first step in this process is to create and activate a virtual enviroment.

It is also necessary to install some libraries: pip install requests beautifulsoup4 lxml selenium python-dotenv

This is the recommended project structure:

project/
├── .env              
├── requirements.txt
├── scraper.py
├── api_client.py
└── data/
    └── output.json

The next step is to extract the data from the web or the API, using the libraries we already installed. Once we extracted the data, we dump them into a JSON file.

To make sure our work is ethical, we must respect the legal terms of service each website provides. This information is stored in the robots.txt file, so we must always check before we start working.

HOW TO ENABLE SECURE HTTPS SERVER AND TWO-FACTOR AUTHENITACTION

To begin, we need to install the folowing web server: sudo apt install apache2

The goal is to encrypt the connection between the user’s browser and the server. To implement a secure server we must obtain an SSL/TLS certificate. Let’s Encrpyt is a platform that provides free and open certificates. We also must install Certbot to install the certificate.

The two-factor authenticator(2FA) adds an extra layer of security by requiring a temporary code in addition to the password. In order to enable the 2FA, we must install and enable the mod_authn_otp Apache module. Finally, we also must create a file to stores each user’s keys.

VM SETUP AND CONFIGURATION

We created two virtual machines based on Linux Mint(v.21.2), one for the database and the other for the web server. The first adapter is configured as NAT to allow internet access exclusively for system maintenance and downloading web dependencies

We created a second adapter for each VM configured as ‘host only’ so both of the VM can communicate with each other.

1. Web Server VM:

2. Database Server VM:

ARCHITECTURE DIAGRAM

*Data flow diagram*

DAY 2

POSTGRESQL AND PGADMIN4 SETUP

On our website’s database machine, we installed PostgreSQL, which comes pre-installed on the Linux Mint virtual machine we’re using. We also installed pgAdmin 4 from the Linux Mint software manager application, set a password for its use since we’ll be storing our store’s database there, and the machine is now ready to use.

FUNCTIONAL REQUIREMENTS

After studying and analyzing our website, the functional requirements were established.

Data Acquisition

Data Storage

Web Development

Web Presentation

RELATIONAL MODEL

Since the project structure is simple, it is possible to make the relational model without referencing a entity/relational diagram.

*A relational model that shows the database structure*

SQL DATABASE CREATION ON PGADMIN4

Looking at the relational model, we defined the database structure and we created it in PgAdmin4 by command line. We used the next SQL sentence to create it:

CREATE TABLE artists (
    artist_id SERIAL PRIMARY KEY,
    discogs_artist_id INT UNIQUE NULL,
    name VARCHAR(150) NOT NULL,
    profile TEXT
);

CREATE TABLE genres (
    genre_id SERIAL PRIMARY KEY,
    name VARCHAR(50) NOT NULL UNIQUE
);

CREATE TABLE formats (
    format_id SERIAL PRIMARY KEY,
    name VARCHAR(30) NOT NULL UNIQUE
);

CREATE TABLE products (
    product_id SERIAL PRIMARY KEY,
    discogs_product_id INT UNIQUE NULL,
    title VARCHAR(255) NOT NULL,
    artist_id INT NULL,
    genre_id INT NULL,
    format_id INT NULL,
    release_date DATE,
    reference_price NUMERIC(10,2), 
    origin_url VARCHAR(500),         
    img_url VARCHAR(500),         
    last_update TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    

    CONSTRAINT fk_products_artists FOREIGN KEY (artist_id) 
        REFERENCES artists(artist_id) ON DELETE RESTRICT,
    CONSTRAINT fk_products_genres FOREIGN KEY (genre_id) 
        REFERENCES genres(genre_id) ON DELETE RESTRICT,
    CONSTRAINT fk_products_formats FOREIGN KEY (format_id) 
        REFERENCES formats(format_id) ON DELETE RESTRICT

*A diagram showing the database tables structure*

VIRTUAL MACHINES CONNECTION SETUP

In order to establish a functional connection between the web-server VM and the database VM, the first step is to make some changes in the postgres configuration files, in the database VM.

Edit postgresql.conf

Search for the line listen_addresses and change its value to '*'

Edit pg_hba.conf

To authorize the web server IP address, it is necessary to edit the pg_hba.conf file, adding a new line at the end of the file with the following information:

Type Database User IP_address method
host all all 192.168.56.103/24 scram-sha 256