Bots are pretty cool.

Photo taken while hanging out in the Arts District.

View the whole project on GitHub here.

 

Background

Automation algorithms, or bots, are one of my favorite things to create. On the surface, the tasks they complete may seem simple, but a closer look at the code reveals their complexity. I didn’t fully appreciate this until I created my first bot.

The bot I developed here performs a basic task, but it also provides the foundation for more advanced applications, such as scraping the web for sentiment analysis on trending topics or collecting and saving large datasets. It also addresses a growing business need: report automation.

Inspiration

The inspiration for this project came when I was searching for an ideal Excel dataset to practice data analysis. I wanted a workbook with multiple sheets to apply lookup formulas and create pivot tables. After failing to find a dataset that fit my needs, I came across a tutorial by Mo Chen, one of my favorite analyst influencers, who had created a dataset specifically for building an Excel portfolio.

After practicing with his dataset, I had an idea: why not create a bot to automate the entire process? The bot would replicate the steps I had just completed manually. Its first task would be to:

  1. Open YouTube, search for the video, follow the link to the spreadsheet, and download the file.

This sparked the concept for a second, more business-oriented application. Having worked in a role that required frequent report generation, I often wondered why more reporting tasks weren’t automated. This inspired the bot’s second task:

2. Modify the downloaded worksheet by filling in missing data values using Excel formulas.


Development

Once I saved the Excel workbook with formulas and data, I opened a new Jupyter notebook to begin coding. The web-scraping script handled the initial task of searching YouTube, navigating to the spreadsheet link, and downloading the file.

Once completed, I containerized and deployed the bot using Docker and AWS to ensure scalability and ease of use.

 

The next step was automating the process of opening the workbook, filling in missing data using Excel formulas, and saving the updated file. You can find the full code for this bot here.

Results and Reflection

After completing both bots, I measured their performance. The Excel-processing bot took around 5 minutes to complete its task, and I plan on recreating the same process using a faster library like pandas to see if I can improve the time.

This project taught me the importance of balancing automation with efficiency. While the bots successfully automated their respective tasks, the results revealed opportunities for optimization. I plan to explore these in future projects, including a comparison of data processing times between Excel and pandas.

For now, I’m signing off with a sense of accomplishment and a reminder that every challenge is an opportunity to grow.

- Mali

Web-scraping bot

Excel-processing bot

Data file credit: Mo Chen

Previous
Previous

Gallery of Graphs: Coffee Sales Data