AM05 AUT24 Outfit Of The Day Recommendation

goodlunn / 2024-12-14 / Original text

AM05 AUT24 Final Project Assignment: Outfit Of The Day Recommendation System

Introduction

Welcome to your final project for the Data Management course. This project is designed to integrate and apply the skills you've acquired throughout the course, including data acquisition, web scraping, ETL processes, SQL database management, automation with Bash scripts, and API development using R.

You will create an Outfit Of The Day Recommendation System (Outfit RecSys) that recommends daily outfits based on the current weather in London. The system will scrape clothing items from websites, store them in a database, retrieve weather data from a public API, and provide outfit recommendations through an API endpoint.

Project Overview

The Outfit RecSys should:

Database: Contain a database of at least 25 clothing items, scraped from an appropriate fashion website, including:

5 pairs of shoes

5 bottoms (e.g., pants, skirts)

5 tops (e.g., shirts, blouses)

5 coats or jackets

5 accessories (e.g., umbrellas, sunglasses)

API Endpoint: Be accessible through an API endpoint using Plumber in R.

Functionality: When the API is called, it should:

Check the current weather in London.

Generate an outfit from the closet database using simple rules.

Create a plot showing the weather forecast and images of the recommended outfit.

Project Components

This project consists of several interconnected components:

  1. Data Acquisition and Scraping: Scrape at least 25 clothing items, including 5 from each category (shoes, bottoms, tops, coats, and accessories).
  2. Data Processing and ETL: Clean and store the scraped data into a SQL database using the provided schema.
  3. Weather Data Integration: Use the Weatherstack API to get current weather data for London and integrate it into your recommendation system.
  4. Recommendation System: Build a simple rules-based recommender system that generates an outfit based on the weather conditions in London.
  5. API Development: Implement an API using Plumber in R with two endpoints: /ootd to get the outfit recommendation and /rawdata to return all product data.
  6. Automation: Automate the entire workflow using Bash scripts.

Detailed Instructions

Use the following names for your scripts:

  1. product_scraping.R
  2. weatherstack_api.R
  3. etl.R
  4. ootd_api.R
  5. run_ootd_api.R
  6. run_pipeline.sh

  1. Data Collection and Web Scraping

Objective: Scrape product images and information to populate your closet database.

Instructions:

Choose a Website: Select an online clothing retailer that allows web scraping (ensure compliance with the website's terms of service).

Scrape Data: Collect at least 25 items covering the categories mentioned above.

For each item, collect the following information:

Product Name

Category (e.g., shoes, tops)

Image URL

Download Images: Save the product images locally in a folder named images, located in your project folder.

Example Code Snippet:

# product_scraping.R
library(rvest)

# Example: Scraping product names and image URLs
url <- "https://www.example.com/clothing"
webpage <- read_html(url)

product_names <- webpage %>% html_nodes(".product-name") %>% html_text()
image_urls <- webpage %>% html_nodes(".product-image") %>% html_attr("src")

# Download images (create the images folder first if it does not exist)
dir.create("images", showWarnings = FALSE)
for (i in seq_along(image_urls)) {
  download.file(image_urls[i],
                destfile = paste0("images/", product_names[i], ".jpg"),
                mode = "wb")
}

# Create a data frame
products <- data.frame(
  name = product_names,
  category = ..., # Extract category
  image_path = paste0("images/", product_names, ".jpg"),
  stringsAsFactors = FALSE
)

# Save data frame for ETL process
write.csv(products, "products_raw.csv", row.names = FALSE)

Note: Replace selectors like ".product-name" with the actual CSS selectors from the chosen website.
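Scraped product names often contain spaces, slashes, or apostrophes, which make poor file names for the downloaded images. The sketch below shows one possible way to derive safe file names in base R. The helper make_file_stem() and the sample product names are hypothetical and not part of the assignment.

```r
# Hypothetical product names as they might come back from a scraper
product_names <- c("Men's Rain Jacket / Navy",
                   "Classic  White T-Shirt",
                   "Leather Chelsea Boots")

# Turn a product name into a lowercase, filesystem-safe file stem
make_file_stem <- function(x) {
  x <- tolower(trimws(x))
  x <- gsub("[^a-z0-9]+", "_", x)  # collapse every run of unsafe characters into "_"
  gsub("^_+|_+$", "", x)           # drop leading and trailing underscores
}

# Build the local image paths from the cleaned stems
image_paths <- file.path("images", paste0(make_file_stem(product_names), ".jpg"))
print(image_paths)
```

If two products reduce to the same stem, base R's make.unique() can be applied to the stems before building the paths.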

  2. Weather Data Acquisition

Objective: Retrieve current weather data for London using the Weatherstack API.

Instructions:

You should already have a Weatherstack API account and API key from Assignment #1. Otherwise, follow the instructions below:

Sign Up: Register for a free API key at Weatherstack.

Store API Key: Save your API key in an environment variable named YOUR_ACCESS_KEY.

Access Weather Data:

Example Code Snippet:

# weatherstack_api.R
library(httr)
library(jsonlite)

# Retrieve API key from environment variable
api_key <- Sys.getenv("YOUR_ACCESS_KEY")

# Construct API request
response <- GET(
  url = "http://api.weatherstack.com/current",
  query = list(
    access_key = api_key,
    query = "London"
  )
)

# Parse the JSON body of the response
weather_data <- fromJSON(content(response, as = "text", encoding = "UTF-8"), flatten = TRUE)

# Extract relevant information
current_temperature <- weather_data$current$temperature
weather_descriptions <- weather_data$current$weather_descriptions

# Save weather data for use in recommendation logic
saveRDS(weather_data, "weather_data.rds")
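Weatherstack reports problems such as an invalid key inside the JSON body (a success field set to false plus an error object), so it can help to check the parsed response before saving it. The following is a minimal sketch under that assumption. The helper extract_weather() and both sample payloads are hypothetical and trimmed to the fields used here.

```r
library(jsonlite)

# Pull out the fields the recommender needs, stopping early on an API-level error
extract_weather <- function(parsed) {
  if (isFALSE(parsed$success)) {
    stop("Weatherstack error: ", parsed$error$info)
  }
  list(
    temperature = parsed$current$temperature,
    descriptions = parsed$current$weather_descriptions
  )
}

# Hypothetical payloads standing in for real API responses
ok_json  <- '{"current": {"temperature": 9, "weather_descriptions": ["Light Rain"]}}'
bad_json <- '{"success": false, "error": {"code": 101, "info": "Invalid access key."}}'

weather <- extract_weather(fromJSON(ok_json))
print(weather$temperature)
print(weather$descriptions)
```

Calling extract_weather(fromJSON(bad_json)) stops with the error text from the payload instead of silently saving an unusable weather_data.rds.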

  3. ETL Process and Database Management

Objective: Clean and store product data into a SQL database.

Instructions:

Create a Database: Use SQLite for simplicity (no server setup required).

Define Schema: Ensure all students use the same schema.

Schema:

CREATE TABLE closet (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  name TEXT,
  category TEXT,
  image_path TEXT
);

ETL Process:

Read the raw product data from products_raw.csv.

Clean the data (e.g., handle missing values).

Insert the cleaned data into the closet table.

Example Code Snippet:

# etl.R
library(DBI)
library(RSQLite)
library(dplyr)

# Read raw data
products <- read.csv("products_raw.csv", stringsAsFactors = FALSE)

# Data cleaning (example): drop rows with missing fields
products_clean <- products %>%
  filter(!is.na(name), !is.na(category), !is.na(image_path))

# Connect to SQLite database
conn <- dbConnect(SQLite(), dbname = "closet.db")

# Write data to database (overwrite = TRUE replaces any existing closet table)
dbWriteTable(conn, "closet", products_clean, overwrite = TRUE, row.names = FALSE)

# Disconnect
dbDisconnect(conn)
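Before loading the table, it may be worth confirming that the cleaned data still meets the five-items-per-category requirement. The sketch below is one way to check this in base R. The category labels in required and the sample data frame are assumptions, since the assignment does not fix the exact label strings.

```r
# Hypothetical cleaned product table with several categories under-filled
products_clean <- data.frame(
  name = paste("item", 1:9),
  category = c(rep("shoes", 5), rep("tops", 3), "Coats "),
  stringsAsFactors = FALSE
)

# Assumed category labels; adjust to whatever labels your scraper produces
required <- c("shoes", "bottoms", "tops", "coats", "accessories")

# Normalise the labels, then count items per required category
products_clean$category <- tolower(trimws(products_clean$category))
counts <- table(factor(products_clean$category, levels = required))

# Categories that still need more items to reach five each
short <- names(counts)[counts < 5]
print(counts)
print(short)
```

Using a factor with fixed levels means categories with no items at all still appear in the count with a value of zero.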

  4. Outfit Recommendation Logic

Objective: Implement rules-based logic to recommend outfits based on weather conditions.

Instructions:

Define Rules:

Temperature > 25°C: Light clothing (e.g., t-shirts, shorts, sandals).

Temperature 15°C - 25°C: Comfortable clothing (e.g., long-sleeve tops, jeans, sneakers).

Temperature < 15°C: Warm clothing (e.g., jackets, sweaters, boots).

Rain Forecast: Include a raincoat or umbrella.

Sunny: Suggest sunglasses.

Implement Logic: Use R to query the database and select items matching the rules.

Example Code Snippet (within ootd_api.R):

# ... within the /ootd endpoint function

# Load weather data
weather_data <- readRDS("weather_data.rds")
temperature <- weather_data$current$temperature
weather_desc <- weather_data$current$weather_descriptions

# Connect to database
conn <- dbConnect(SQLite(), dbname = "closet.db")

# Initialize outfit list
outfit <- list()

# Apply rules
if (temperature > 25) {
  # Select light clothing
  outfit$top <- dbGetQuery(conn, "SELECT * FROM closet WHERE category = 't-shirt' LIMIT 1")
  # Assumed category: 'shorts', following the light-clothing rule
  outfit$bottom <- dbGetQuery(conn, "SELECT * FROM closet WHERE category = 'shorts' LIMIT 1")
  outfit$shoes <- dbGetQuery(conn, "SELECT * FROM closet WHERE category = 'sandals' LIMIT 1")
} else if (temperature >= 15 && temperature <= 25) {
  # Select comfortable clothing
} else {
  # Select warm clothing
}

# Check for rain (weather_desc may hold several descriptions, hence any())
if (any(grepl("rain", weather_desc, ignore.case = TRUE))) {
  outfit$accessory <- dbGetQuery(conn, "SELECT * FROM closet WHERE category = 'umbrella' LIMIT 1")
}

# Disconnect
dbDisconnect(conn)

# Proceed to create the plot with selected items
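The rules listed above can also be kept in a small pure function, separate from the database queries, which makes them easy to test on their own. This is only a sketch: choose_items() is a hypothetical helper, the item labels are illustrative, and the choice of jeans for cold weather is an assumption, since the rules do not name a bottom for that case.

```r
# Map the weather to the kinds of items to look for (labels are illustrative)
choose_items <- function(temperature, weather_desc) {
  if (temperature > 25) {
    items <- c(top = "t-shirt", bottom = "shorts", shoes = "sandals")
  } else if (temperature >= 15) {
    items <- c(top = "long-sleeve top", bottom = "jeans", shoes = "sneakers")
  } else {
    # Bottom for cold weather is an assumption; the rules only list jackets, sweaters, boots
    items <- c(top = "sweater", bottom = "jeans", shoes = "boots", outerwear = "jacket")
  }

  # Collapse possibly several descriptions into one lowercase string
  desc <- tolower(paste(weather_desc, collapse = " "))
  if (grepl("rain", desc)) {
    items["accessory"] <- "umbrella"
  } else if (grepl("sunny", desc)) {
    items["accessory"] <- "sunglasses"
  }
  items
}

print(choose_items(12, "Light Rain"))
print(choose_items(30, "Sunny"))
```

Each returned label can then be used as the category value in a database query such as the ones shown above.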

  5. API Development with Plumber

Objective: Develop two API endpoints using Plumber in R.

Endpoints:

/ootd: Returns a plot showing the outfit recommendation.

/rawdata: Returns all product data as a JSON object.

Instructions:

Setup Plumber: Install and load the plumber package.

Define Endpoints:

Example Code Snippet:

# ootd_api.R
library(plumber)
library(DBI)
library(RSQLite)
library(jsonlite)

#* @apiTitle Outfit Recommendation API

#* Get Outfit of the Day (returned as a PNG image)
#* @serializer png
#* @get /ootd
function() {
  # Implement recommendation logic (as per previous section)

  # Create a plot
  plot.new()

  # Example plot code:
  plot.window(xlim = c(0, 1), ylim = c(0, 1))
  text(0.5, 0.9, paste("Date:", Sys.Date()), cex = 1.5)
  text(0.5, 0.8, paste("Weather:", weather_desc), cex = 1.2)

  # Add images (this is a placeholder, you need to use functions like rasterImage)

  # Return the plot
}

#* Get Raw Product Data
#* @get /rawdata
function() {
  conn <- dbConnect(SQLite(), dbname = "closet.db")
  data <- dbGetQuery(conn, "SELECT * FROM closet")
  dbDisconnect(conn)
  # Plumber's default serializer converts the data frame to JSON,
  # so calling toJSON() here would encode the result twice
  data
}

  6. Guidance on the Outfit of the Day Format

You are required to generate an outfit recommendation output that presents the selected items in a clear and visually appealing manner. This output will be a key component of your project's deliverables, particularly when testing your /ootd API endpoint. Below are the guidelines to help you create an effective recommendation output.

Essential Components

Your /ootd recommendation output is an image that must include the following elements:

  1. Date and Weather Forecast:

Today's Date: Display the current date prominently.

Weather Forecast: Include a brief description of the weather conditions, such as temperature, weather descriptions (e.g., sunny, rainy), and any other relevant details retrieved from the Weatherstack API.

  2. Outfit Images:

Clothing Categories: The outfit must consist of images representing each of the following categories:

Shoes

Bottom (e.g., trousers, jeans, skirts)

Top (e.g., shirts, sweaters, blouses)

Outerwear (e.g., jackets, coats)

Accessory (e.g., sunglasses, umbrella, bag)

Image Quality: Ensure that the images are clear and of high quality so that the details of each item are visible.

Layout and Presentation

You have creative freedom in how you present the outfit images, but your layout should adhere to the following guidelines:

Clarity and Visibility:

Arrange the images in a way that each item is fully visible and not obscured by other elements.

Avoid overlapping images unless it enhances the presentation without compromising clarity.

Layout Options:

Mosaic/Grid Layout: Place the images in a grid format, aligning them neatly in rows and columns. This approach ensures that each item has its own space.

Stylistic Overlay: If you prefer a more creative approach, you can overlay the images to mimic how the outfit would look when worn together. Ensure that this method still allows each item to be distinctly identified.

Labels and Annotations (Optional):

You may include labels or brief descriptions next to each item to indicate the category or any special features.

Use legible fonts and colours that contrast well with the background and images.

Example Approaches

Here are some ideas on how you might structure your output:

  1. Mosaic/Grid Example:
  2. Stylistic Overlay Example:

Technical Implementation Tips

Image Processing with magick:

Use the magick package in R to manipulate and combine images.

Ensure that all images are resized proportionally to maintain aspect ratios.

Use image_append() or image_montage() functions to arrange images in a grid.

For overlays, use image_composite() with appropriate gravity and offsets.

Adding Text Annotations:

Use image_annotate() to add the date and weather information at the top or bottom of the output image.

Choose font sizes and styles that are readable and professional.

File Formats and Sizes:

Save the final output as a PNG or JPEG file.

Optimise the image size to balance quality and file size.

Testing Your Output

Visual Inspection:

Open the generated image to ensure that all elements are displayed correctly.

Check for any distortions or misalignments.

Consistency with Recommendation Logic:

Verify that the selected items align with your recommendation logic based on the weather data.

Ensure that accessories like umbrellas are included on rainy days.
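The magick tips above can be sketched roughly as follows. This is an illustration rather than a required layout: blank placeholder tiles stand in for the five product images (in a real script they would come from image_read() on the files in images/), and the weather text in the banner is a made-up example.

```r
library(magick)

# Placeholder tiles stand in for the five product images
categories <- c("shoes", "bottom", "top", "outerwear", "accessory")
tiles <- lapply(categories, function(label) {
  img <- image_blank(width = 300, height = 400, color = "grey90")
  img <- image_scale(img, "200x200")  # fit within 200x200 while keeping the aspect ratio
  image_annotate(img, label, gravity = "south", size = 16, color = "black")
})

# Join the tiles into one multi-frame image and lay them out left to right
outfit_row <- image_append(image_join(tiles))

# Banner with the date and a hypothetical weather summary
banner <- image_blank(width = image_info(outfit_row)$width, height = 60, color = "white")
banner <- image_annotate(banner,
                         paste("Date:", Sys.Date(), "| Weather: Light Rain, 9 C"),
                         gravity = "center", size = 20, color = "black")

# Stack the banner on top of the row and save the result as a PNG
ootd <- image_append(c(banner, outfit_row), stack = TRUE)
out_path <- file.path(tempdir(), "ootd_plot.png")
image_write(ootd, path = out_path, format = "png")
print(image_info(ootd)[, c("width", "height")])
```

For a grid with several rows, image_montage() with a tile argument is an alternative to appending rows by hand.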

  7. Automation with Bash Scripts

Objective: Automate the entire pipeline so that the assessor can run your Bash script and retrieve the outfit recommendation.

Instructions:

Create a Bash Script: Name it run_pipeline.sh.

Script Requirements:

Accept an input variable for the Weatherstack access key.

Example:

#!/bin/bash
# Usage: ./run_pipeline.sh YOUR_ACCESS_KEY

YOUR_ACCESS_KEY=$1
export YOUR_ACCESS_KEY

# Run R scripts
Rscript product_scraping.R
Rscript weatherstack_api.R
Rscript etl.R
Rscript run_ootd_api.R &

# Wait for API to start
sleep 5

# Call the /ootd endpoint
curl "http://localhost:8000/ootd" --output ootd_plot.png

echo "Outfit of the Day plot saved as ootd_plot.png"

Run OOTD API: The run_ootd_api.R script should start the Plumber API on port 8000.

Example Code Snippet:

# run_ootd_api.R
library(plumber)

# Load the API
r <- plumb("ootd_api.R")

# Run the API on port 8000
r$run(port = 8000)

Deliverables

Project Folder: A zipped folder named win-123456.zip or mac-123456.zip, where 123456 is your student number.

  1. Bash Script:

A script named run_pipeline.sh that:

Takes an input variable for the Weatherstack access key (YOUR_ACCESS_KEY).

Loads and runs all relevant R scripts.

Makes a call to the /ootd endpoint using curl to produce the plot of the Outfit of the Day.

  2. R Scripts:

product_scraping.R

weatherstack_api.R

etl.R

ootd_api.R

run_ootd_api.R

  3. Product Images:

A folder named images containing the product images associated with your closet database.

  4. Example Outfit Image:

Outfit generated from your /ootd endpoint with file name ootd_plot.png.

  5. Readme File:

README Updates: In your README.md, add a section that explains how the recommendation output is generated.

Provide any instructions necessary to reproduce the output.

Important Notes

Environment Variables: Ensure your API key is retrieved from an environment variable that is passed to the Bash script from the command line.

Example:

#!/bin/bash
# example_script.sh
# Usage: ./example_script.sh YOUR_API_KEY

# Check if the API key is provided
if [ -z "$1" ]; then
  echo "Usage: $0 YOUR_API_KEY"
  exit 1
fi

# Get the API key from the command-line argument
YOUR_API_KEY=$1

# Export the API key as an environment variable
export YOUR_API_KEY

# Now you can use the API key in your script
echo "API key has been set as an environment variable."

# Example usage within the script
echo "Using the API key in the script:"
echo "The API key is: $YOUR_API_KEY"

# Example of running another script that uses the API key
# Assuming you have a script called api_call_script.sh
# ./api_call_script.sh

# Alternatively, run an R script that uses the API key
# Rscript my_r_script.R

Suppose your API key is abcd1234. You would run the script as follows:

./example_script.sh abcd1234

Using the API Key in an R Script (e.g., my_r_script.R):

# my_r_script.R

# Retrieve the API key from the environment variable
api_key <- Sys.getenv("YOUR_API_KEY")

if (api_key == "") {
  stop("API key not found. Please ensure YOUR_API_KEY is set as an environment variable.")
}

# Use the API key in your API calls
# For example:
library(httr)
response <- GET("https://api.example.com/data", add_head
# Process the response as needed

Port Configuration: The API should run on port 8000. Ensure no other services are using this port.

Dependencies: List any R packages required in a README.md file.

Testing: Verify that your entire pipeline works on a different machine to ensure it runs outside of your development environment.

Assessment Criteria (Total: 100 points)

  1. Data Collection and Scraping (15 points)

Quality and completeness of the web scraping script (10 points).

Variety and coverage of items across different categories (5 points).

  2. Database Design and Implementation (10 points)

Correct SQL database design according to the specified schema (5 points).

Successful population of the database with scraped items (5 points).

  3. Weather Integration (10 points)

Successful integration and automation of weather data retrieval (5 points).

Correct usage and storage of weather data in the system (5 points).

  4. Outfit Recommender System (20 points)

Effectiveness of the recommendation logic (10 points).

Proper implementation using R (10 points).

  5. Automation and Workflow (15 points)

Use of Bash scripts to automate tasks (10 points).

Correct execution of the entire pipeline from the script (5 points).

  6. Code Quality and Documentation (10 points)

Code readability and adherence to best practices (5 points).

Clear documentation and instructions in a README.md file (5 points).

  7. OOTD Endpoint Functionality (20 points)

/ootd endpoint returns a plot showing date, weather forecast, and outfit images (10 points).

/rawdata endpoint returns all products in the closet database as JSON (10 points).

Bonus steps / functionality (10 bonus points)

50+ products are added to your closet database (5 points).

/ootd endpoint has additional functionality to produce two or more outfit choices for each call rather than one outfit (5 points).

Submission Instructions

Deadline: See Canvas assignment page.

File Naming: Ensure your zipped folder follows the naming convention (win-123456.zip or mac-123456.zip).

Tips and Best Practices

Testing: Run your Bash script from start to finish to ensure all components work seamlessly.

Error Handling: Include error checks in your scripts to handle potential issues (e.g., missing data, API errors).

Comments: Comment your code to explain the logic and flow.

Dependencies: Use renv or list your packages to ensure the assessor can install them easily.

Security: Do not hardcode your API keys in the scripts; always use environment variables.

Data Privacy: Ensure compliance with data scraping regulations and respect website terms of service.
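As one possible reading of the error-handling tip above, each pipeline step can be wrapped in tryCatch() so that a failure is reported with a clear message. The helper run_step() and the two example steps are hypothetical.

```r
# Run one pipeline step and report failure instead of stopping with a bare error
run_step <- function(label, step) {
  tryCatch({
    result <- step()
    message("[ok] ", label)
    list(ok = TRUE, value = result)
  }, error = function(e) {
    message("[failed] ", label, ": ", conditionMessage(e))
    list(ok = FALSE, value = NULL)
  })
}

# A step that succeeds and a step that fails (both illustrative)
good <- run_step("parse temperature", function() as.numeric("12"))
bad  <- run_step("fetch weather", function() stop("API returned no data"))
```

In a script run from Bash, a failed step could be followed by quit(status = 1) so that the calling script can detect the failure from the exit code.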

Getting Started

  1. Set Up Your Environment:

Install necessary R packages:

rvest

httr

jsonlite

DBI

RSQLite

plumber

Ensure you have curl installed for making HTTP requests in the Bash script.

  2. Plan Your Approach:

Review the requirements and plan each step.

Start by setting up your database schema.

  3. Incremental Development:

Test each component individually before integrating.

Use print statements or logs to debug.

  4. Consult Course Materials:

Revisit workshops and assignments related to each component.

Support

If you have any questions or need clarification, please reach out during office hours or via email at jfrancis@london.edu.

Good luck with your project!

APPENDIX 1.0 - Guidelines for Your README File

Your README.md file is a crucial part of your project submission. It should provide clear instructions and information to help others understand and run your project without any confusion. Below are some key points you should include:

Project Title and Description:

Clearly state the name of your project.

Provide a brief overview of the project's purpose and functionality.

Table of Contents (Optional for Longer READMEs):

If your README is extensive, include a table of contents to help readers

navigate the document.

Prerequisites and Dependencies:

List all software, packages, and libraries required to run your project.

For example: R (version X.X.X), SQLite, Bash shell, rvest, httr, jsonlite, etc.

Include any system requirements or platform-specific instructions.

Provide commands or steps to install these dependencies.

Installation and Setup Instructions:

Step-by-step guidance on how to set up the project environment.

Cloning or downloading the project repository.

Setting up directories and files.

Instructions on obtaining and setting up the Weatherstack API key.

How to export the API key as an environment variable if needed.

Project Structure Overview:

Briefly describe the purpose of each major script and file in your project.

product_scraping.R: Scrapes product data and images from the web.

weatherstack_api.R: Fetches current weather data using the Weatherstack API.

etl.R: Cleans data and populates the SQLite database.

ootd_api.R: Defines the API endpoints using Plumber.

run_ootd_api.R: Runs the API server.

run_pipeline.sh: Bash script that automates the entire pipeline.

images/: Directory containing product images.

closet.db: SQLite database file containing the closet data.

Mention any additional files or directories, such as logs or outputs.

Usage Instructions:

How to run the entire pipeline using the Bash script.

Example command:

./run_pipeline.sh YOUR_ACCESS_KEY

Instructions on how to start the API server independently if needed.

Example command:

Rscript run_ootd_api.R

How to access the API endpoints.

Accessing /ootd and /rawdata via a web browser or using curl.

Example:

curl "http://localhost:8000/ootd" --output ootd_plot.png

Any additional steps required to generate the outputs.

Recommendation Logic Explanation:

Describe how the weather data influences the outfit recommendation.

Temperature thresholds and corresponding clothing choices.

Handling of specific weather conditions (e.g., rain).

Any additional logic or rules implemented.

Output Description:

Details about the generated outputs, such as the outfit plot image.

Explain the contents and format of ootd_plot.png.

Includes date, weather forecast, and images of the outfit items.

Mention any other output files and their purposes.

Additional Features (Bonus Implementations):

Describe any extra items added to the closet beyond the required 25.

Detail any additional API endpoints you have created.

Their purposes and how to access them.

Explain if you have implemented multiple outfit suggestions.

Troubleshooting and FAQs:

Common issues that might arise and their solutions.

API key errors.

Missing dependencies.

Port conflicts if the API server doesn't start.

Tips for ensuring the scripts run smoothly.

Dependencies and Package Installation:

Provide a list of R packages and how to install them.

Example:

install.packages(c("rvest", "httr", "jsonlite", "DBI", "RSQLite",

"plumber", "dplyr", "magick"))

Instructions for installing any system-level dependencies if applicable.

License Information (Optional):

Specify any licenses if you are using third-party code or resources.

Contact Information (Optional):

Your name and email address for any questions or feedback.

Acknowledgments (Optional):

Credit any resources, tutorials, or individuals that helped you.

Formatting Tips:

Use Markdown syntax to structure your README:

Headings (#, ##, ###) for sections and subsections.

Bullet points and numbered lists for clarity.

Backticks for inline code (`code`) and triple backticks for code blocks.

Hyperlinks for referencing external resources or documentation.

Example of a Command:

./run_pipeline.sh YOUR_ACCESS_KEY

Example of Inline Code:

To install packages:

install.packages("package_name")

Final Checklist:

Clarity and Conciseness:

Ensure instructions are easy to follow and free of jargon.

Keep sentences and paragraphs short and to the point.

Completeness:

Double-check that all required sections are included.

Verify that all instructions are accurate and up-to-date.

Proofreading:

Check for spelling and grammatical errors.

Ensure consistent formatting throughout the document.

APPENDIX 2.0 - Passing Variables, Data, and Files Between Scripts in a Pipeline

In a data processing pipeline, it's essential to pass variables, data, and files from one script to another to ensure seamless execution and maintain modularity. This practice allows different components of the pipeline to communicate and share necessary information without tightly coupling the scripts. Below are various methods to achieve this, along with explanations of their importance and examples based on the Final Project Assignment: Personal Outfit Recommendation System.

  1. Command-Line Arguments

Explanation:

Scripts can accept input parameters directly from the command line when they are executed.

This method allows you to pass variables, such as API keys or file paths, dynamically.

Why It's Important:

Flexibility: Users can specify different inputs without modifying the script code.

Security: Sensitive information like API keys can be passed at runtime instead of hardcoding them.

Example from the Project:

Passing the Weatherstack API Key:

In the Bash script run_pipeline.sh, the API key is passed as a command-line argument:

./run_pipeline.sh YOUR_ACCESS_KEY

Within run_pipeline.sh, the API key is captured and exported:

#!/bin/bash

# Check if the API key is provided
if [ -z "$1" ]; then
  echo "Usage: $0 YOUR_ACCESS_KEY"
  exit 1
fi

# Export the API key as an environment variable
export YOUR_ACCESS_KEY="$1"

Each R script can then access the API key from the environment variable.
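An R script can also read command-line arguments itself through commandArgs(). The sketch below shows a hypothetical helper, resolve_key(), that prefers a key passed as the first argument and otherwise falls back to the YOUR_ACCESS_KEY environment variable. Passing the key directly to an R script is an assumption for illustration, not something the assignment requires.

```r
# Resolve the access key: first command-line argument if given, else the environment variable
resolve_key <- function(args = commandArgs(trailingOnly = TRUE),
                        env_value = Sys.getenv("YOUR_ACCESS_KEY")) {
  if (length(args) >= 1 && nzchar(args[1])) {
    return(args[1])
  }
  if (nzchar(env_value)) {
    return(env_value)
  }
  stop("No access key supplied as an argument or in YOUR_ACCESS_KEY.")
}

# Illustrative calls with explicit inputs instead of real command-line arguments
from_args <- resolve_key(args = "abcd1234", env_value = "")
from_env  <- resolve_key(args = character(0), env_value = "wxyz5678")
print(c(from_args, from_env))
```

With the default arguments, calling resolve_key() inside a script started as Rscript my_r_script.R abcd1234 would pick up abcd1234.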

  2. Environment Variables

Explanation:

Environment variables are key-value pairs that, once exported, are available to every process started from the shell session.

Scripts can read environment variables to obtain necessary information.

Why It's Important:

Security: Keeps sensitive data out of the codebase.

Consistency: Ensures that all scripts access the same variable values.

Portability: Environment variables can be easily configured on different systems.

Example from the Project:

Accessing the API Key in R Scripts:

In weatherstack_api.R, the API key is retrieved from the environment:

# Retrieve the API key from the environment variable
api_key <- Sys.getenv("YOUR_ACCESS_KEY")

if (api_key == "") {
  stop("API key not found. Please ensure YOUR_ACCESS_KEY is set as an environment variable.")
}

  3. Reading and Writing Files

Explanation:

Scripts can write data to files, which subsequent scripts read and process.

Common file formats include CSV, JSON, RDS (R's binary format), and databases.

Why It's Important:

Data Persistence: Stores intermediate results that can be reused or inspected.

Decoupling: Allows scripts to operate independently, focusing on specific tasks.

Debugging: Facilitates troubleshooting by examining intermediate files.

Example from the Project:

Sharing Scraped Data:

product_scraping.R: Scrapes product data and saves it to a CSV file.

# Save raw product data
write.csv(products, "products_raw.csv", row.names = FALSE)

etl.R: Reads the CSV file for data cleaning and loading into the database.

# Read the raw product data
products <- read.csv("products_raw.csv", stringsAsFactors = FALSE)

Storing Weather Data:

weatherstack_api.R: Fetches weather data and saves it as an RDS file.

# Save weather data to an RDS file
saveRDS(weather_data, "weather_data.rds")

ootd_api.R: Reads the weather data for generating outfit recommendations.

# Load weather data

weather_data <- readRDS("weather_data.rds")
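The two hand-offs above can be tried end to end with a small round trip. This sketch writes to a temporary directory so it does not touch the project files, and the sample products and weather values are made up for illustration.

```r
# Hypothetical intermediate data
products <- data.frame(
  name = c("Blue jeans", "Rain coat"),
  category = c("bottoms", "coats"),
  stringsAsFactors = FALSE
)
weather <- list(current = list(temperature = 9L, weather_descriptions = "Light Rain"))

# Write both files to a temporary directory
csv_path <- file.path(tempdir(), "products_raw.csv")
rds_path <- file.path(tempdir(), "weather_data.rds")
write.csv(products, csv_path, row.names = FALSE)
saveRDS(weather, rds_path)

# Read them back, as a downstream script would
products_back <- read.csv(csv_path, stringsAsFactors = FALSE)
weather_back <- readRDS(rds_path)

print(products_back)
print(weather_back$current$temperature)
```

CSV keeps only a flat table of values, while RDS restores the R object as it was saved, which is why the nested weather list is stored as RDS.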

  4. Using Databases

Explanation:

Databases provide a structured way to store and retrieve data.

Scripts can insert data into a database, which other scripts can query as needed.

Why It's Important:

Data Integrity: Enforces data types and constraints.

Concurrency: Allows multiple scripts to access data without conflicts.

Scalability: Handles larger datasets efficiently.

Example from the Project:

Centralized Data Storage:

etl.R: Inserts cleaned product data into a SQLite database.

# Connect to the SQLite database
conn <- dbConnect(SQLite(), dbname = "closet.db")

# Write data to the 'closet' table
dbWriteTable(conn, "closet", products_clean, append = TRUE, row.names = FALSE)

ootd_api.R: Queries the database to select items for the outfit.

# Connect to the SQLite database
conn <- dbConnect(SQLite(), dbname = "closet.db")

# Query for outfit items based on category
outfit_item <- dbGetQuery(conn, "SELECT * FROM closet WHERE category = 'tops' ORDER BY RANDOM() LIMIT 1")

  5. Standard Input and Output (Pipes)

Explanation:

Scripts can read from standard input (stdin) and write to standard output (stdout).

Allows chaining commands using pipes (|), where the output of one command serves as input to another.

Why It's Important:

Stream Processing: Useful for processing data streams or large datasets.

Flexibility: Enables quick data transformations without intermediate files.

Example from the Project:

Chaining Commands (Hypothetical):

While not explicitly used in the project, you could use pipes in the command line:

# Pass the output of one script to another
Rscript script1.R | Rscript script2.R
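On the receiving end of such a pipe, an R script run with Rscript can read the incoming text from file("stdin"). The sketch below keeps the reading logic in a hypothetical function, count_matches(), that works on any connection. A text connection stands in for stdin so the example runs on its own.

```r
# Count how many lines mention a keyword in the text arriving on a connection
count_matches <- function(con, keyword) {
  lines <- readLines(con)
  sum(grepl(keyword, lines, ignore.case = TRUE))
}

# In a piped script this would be: count_matches(file("stdin"), "rain")
# Here a text connection with made-up lines stands in for stdin
demo <- textConnection(c("Light Rain", "Sunny", "Heavy rain showers"))
n <- count_matches(demo, "rain")
close(demo)
print(n)
```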

  6. Function Calls Between Scripts (Sourcing)

Explanation:

One script can source another, effectively importing its functions and variables.

In R, source("script.R") runs the code from the sourced script in the current environment.

Why It's Important:

Code Reusability: Share common functions without duplicating code.

Organisation: Keep code modular and maintainable.

Example from the Project:

Shared Functions (Hypothetical):

If you have utility functions used across scripts:

# In 'utils.R'
calculate_temperature_category <- function(temp) {
  if (temp > 25) {
    return("hot")
  } else if (temp >= 15) {
    return("mild")
  } else {
    return("cold")
  }
}

# In 'ootd_api.R'
source("utils.R")

# Use the function
temp_category <- calculate_temperature_category(temperature)

Conclusion:

Choose the method that best fits the data's nature, the scripts' requirements, and the project's complexity. Combining these methods often yields the best results in a real-world application.