Lesson 2: Preparing Database: Schema and Data

We must prepare the database before we start working on our Python script. We don't have the exact DB structure and data from the client, so we just created a small database of cars with add-ons.

This will cover the following part of our architecture:

This is how our SQL looks for these:

create table cars
(
id bigint unsigned auto_increment
primary key,
manufacturer varchar(255) not null,
model varchar(255) not null,
year varchar(255) not null,
type varchar(255) not null,
drivetrain varchar(255) not null,
created_at timestamp null,
updated_at timestamp null
) collate = utf8mb4_unicode_ci;
 
create table addon_categories
(
id bigint unsigned auto_increment
primary key,
title varchar(255) not null,
created_at timestamp null,
updated_at timestamp null
) collate = utf8mb4_unicode_ci;
 
create table car_addons
(
id bigint unsigned auto_increment
primary key,
car_id bigint unsigned not null,
addon_category_id bigint unsigned not null,
title varchar(255) not null,
created_at timestamp null,
updated_at timestamp null,
constraint car_addons_addon_category_id_foreign
foreign key (addon_category_id) references addon_categories (id),
constraint car_addons_car_id_foreign
foreign key (car_id) references cars (id)
) collate = utf8mb4_unicode_ci;

This will be enough to provide the results for typical simple customer queries like "Do you have Prius cars newer than 2016".

But the system will be easily extendable. In the future, you may add more columns like price or description, and the GPT engine will understand them.

Next, we want some data inside. For this, we defined a few rows in each table. Here's the SQL for that.

The full SQL is in the GitHub repository.

insert into addon_categories (id, title, created_at, updated_at) values (1, 'Interior', '2024-02-01 15:38:58', '2024-02-01 15:38:58');
insert into addon_categories (id, title, created_at, updated_at) values (2, 'Exterior', '2024-02-01 15:38:58', '2024-02-01 15:38:58');
 
insert into cars (id, manufacturer, model, year, type, drivetrain, created_at, updated_at) values (1, 'BMW', 'M3', '2018', 'sedan', 'RWD', '2024-02-01 15:38:58', '2024-02-01 15:38:58');
insert into cars (id, manufacturer, model, year, type, drivetrain, created_at, updated_at) values (2, 'BMW', 'M4', '2018', 'coupe', 'AWD', '2024-02-01 15:38:58', '2024-02-01 15:38:58');
insert into cars (id, manufacturer, model, year, type, drivetrain, created_at, updated_at) values (3, 'Audi', 'A1', '2015', 'coupe', 'FWD', '2024-02-01 15:38:58', '2024-02-01 15:38:58');
# ... you can find more cars in the GitHub repository
 
insert into car_addons (id, car_id, addon_category_id, title, created_at, updated_at) values (1, 1, 1, 'M Performance Exhaust', '2024-02-01 15:38:58', '2024-02-01 15:38:58');
insert into car_addons (id, car_id, addon_category_id, title, created_at, updated_at) values (2, 1, 2, 'M Performance Carbon Fiber Rear Spoiler', '2024-02-01 15:38:58', '2024-02-01 15:38:58');
insert into car_addons (id, car_id, addon_category_id, title, created_at, updated_at) values (3, 1, 2, 'M Performance Carbon Fiber Front Splitters', '2024-02-01 15:38:58', '2024-02-01 15:38:58');
insert into car_addons (id, car_id, addon_category_id, title, created_at, updated_at) values (4, 2, 1, 'M Performance Exhaust', '2024-02-01 15:38:58', '2024-02-01 15:38:58');
# ... you can find more addons in the GitHub repository

So, you can run those queries to seed that testing data in a local MySQL database:

From here, we will start building our Python script, which will work with that data.


No comments or questions yet...