The chaos of spaghetti scrapers
GOrimpo started as a niche project, a personal tool to help me with something I’m very interested in (retro games). It was meant to be my “tech lab” where I could apply all sorts of technologies. However, in less than a week, my second post about GOrimpo surpassed 70,000 views, and this niche retro gaming project caught the attention of many people!
A common problem with traditional scrapers is spaghetti code, often a single massive .py file with thousands of lines. Network code is mixed with business logic, which ties everything to specific libraries (switch from Playwright to Selenium and you rewrite everything) and makes testing difficult without spinning up a browser. Hexagonal Architecture (also known as Ports & Adapters) is what separates a disposable script from a scalable tool.
The anatomy of GOrimpo
In GOrimpo, I adopted Hexagonal Architecture with expansion in mind: other marketplaces (like Enjoei, Mercado Livre, etc.) and other ways to notify the user (Discord, WhatsApp, and so on). The table below summarizes the layers:
| Layer | Responsibility |
|---|---|
| Domain (core) | What an offer is, how we filter prices… |
| Ports (interfaces) | Interfaces for Notifier, Scraper… |
| Adapters | Implementations for the ports, such as OLXScraper, Telegram, SQLite |
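To make the table concrete, here is a minimal single-file sketch of the three layers. It is not GOrimpo's actual code: the `Offer` fields, `FilterByMaxPrice`, `staticScraper`, and `FindDeals` are hypothetical names chosen for illustration, and a real project would split them across separate packages.

```go
package main

import "fmt"

// Domain (core): a hypothetical, simplified Offer. The real domain.Offer may differ.
type Offer struct {
	Title string
	Price float64
}

// FilterByMaxPrice is a pure business rule: no network, no browser.
func FilterByMaxPrice(offers []Offer, maxPrice float64) []Offer {
	var kept []Offer
	for _, o := range offers {
		if o.Price <= maxPrice {
			kept = append(kept, o)
		}
	}
	return kept
}

// Port: the only thing the core knows about scraping.
type Scraper interface {
	Search(term string) ([]Offer, error)
}

// Adapter: a hypothetical in-memory stand-in for a real marketplace scraper.
type staticScraper struct{ offers []Offer }

func (s staticScraper) Search(term string) ([]Offer, error) {
	return s.offers, nil
}

// FindDeals is core logic written only against the port.
func FindDeals(s Scraper, term string, maxPrice float64) ([]Offer, error) {
	offers, err := s.Search(term)
	if err != nil {
		return nil, fmt.Errorf("search %q: %w", term, err)
	}
	return FilterByMaxPrice(offers, maxPrice), nil
}

func main() {
	s := staticScraper{offers: []Offer{
		{Title: "Console A", Price: 350},
		{Title: "Console B", Price: 900},
	}}
	deals, err := FindDeals(s, "console", 500)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(len(deals), deals[0].Title) // prints "1 Console A"
}
```

Because `FindDeals` depends only on the `Scraper` port, it can be exercised with the in-memory adapter, with no browser involved.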
Theory is great, but where do the files actually live? In GOrimpo, physical separation ensures I don’t mix business logic with infrastructure:
├── cmd
│   └── gorimpo
├── data
├── docs
└── internal
    ├── adapters
    │   ├── config
    │   ├── infrastructure
    │   ├── notifier
    │   ├── repository
    │   ├── scraper
    │   └── telemetry
    └── core
        ├── domain
        ├── ports
        └── services
True Plug & Play: The power of decoupling
This way, the core has no clue that OLX exists. It knows there is a Scraper, but not whether it belongs to OLX, Enjoei, eBay, or any other marketplace, all thanks to the interface. This makes swapping components extremely fast. If Telegram ended bot support tomorrow, we would simply create a new DiscordNotifier, implement the interface, and make minor changes in main.go.
olxScraper := scraper.NewOLX(Version != "dev", cfg, idGen)
enjoeiScraper := scraper.NewEnjoei(Version != "dev", cfg, idGen)
meliScraper := scraper.NewMercadoLivre(Version != "dev", cfg, idGen)
ebayScraper := scraper.NewEbay(Version != "dev", cfg, idGen)
// other implementations ...
gorimpoSvc := services.NewGorimpoService(olxScraper, repo, telegram, metrics, cfg)
// any of the scrapers above is accepted!
// gorimpoSvc := services.NewGorimpoService(enjoeiScraper, repo, telegram, metrics, cfg)
// gorimpoSvc := services.NewGorimpoService(meliScraper, repo, telegram, metrics, cfg)
The intention when creating scrapers for other marketplaces is to allow a slice of scrapers, enabling GOrimpo to perform searches across multiple sites in parallel.
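One way that slice-of-scrapers idea could look is sketched below. This is an assumption about a possible design, not GOrimpo's implementation: `SearchAll`, `fakeScraper`, and the simplified `Offer` are hypothetical, and the sketch uses only goroutines, `sync.WaitGroup`, and `sync.Mutex` from the standard library.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// Offer is a simplified, hypothetical stand-in for domain.Offer.
type Offer struct {
	Title  string
	Source string
}

type Scraper interface {
	Search(term string) ([]Offer, error)
}

// SearchAll queries every scraper concurrently and merges the results.
// A failing scraper does not cancel the others; its error is collected.
func SearchAll(scrapers []Scraper, term string) ([]Offer, []error) {
	var (
		mu     sync.Mutex
		wg     sync.WaitGroup
		offers []Offer
		errs   []error
	)
	for _, s := range scrapers {
		wg.Add(1)
		go func(s Scraper) {
			defer wg.Done()
			found, err := s.Search(term)
			mu.Lock() // guard the shared slices
			defer mu.Unlock()
			if err != nil {
				errs = append(errs, err)
				return
			}
			offers = append(offers, found...)
		}(s)
	}
	wg.Wait()
	return offers, errs
}

// fakeScraper is a hypothetical adapter used only for this demonstration.
type fakeScraper struct {
	name string
	fail bool
}

func (f fakeScraper) Search(term string) ([]Offer, error) {
	if f.fail {
		return nil, errors.New(f.name + ": unavailable")
	}
	return []Offer{{Title: term, Source: f.name}}, nil
}

func main() {
	scrapers := []Scraper{
		fakeScraper{name: "site-a"},
		fakeScraper{name: "site-b"},
		fakeScraper{name: "site-c", fail: true},
	}
	offers, errs := SearchAll(scrapers, "snes")
	fmt.Println(len(offers), "offers,", len(errs), "errors")
}
```

The order of the merged offers depends on goroutine scheduling, so callers that need a stable order would have to sort the result.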
As the NewGorimpoService calls show, any implementation is accepted because Go interfaces are satisfied implicitly. As long as a type implements every method the interface declares, the compiler treats it as a valid implementation. No `implements` keyword or inheritance is required, which keeps the code flexible and decoupled.
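To see implicit satisfaction in isolation, consider this small sketch. `browserScraper`, `apiScraper`, and `countOffers` are hypothetical names, and `Offer` is reduced to a single field for brevity.

```go
package main

import "fmt"

// Offer is a simplified, hypothetical stand-in for domain.Offer.
type Offer struct{ Title string }

type Scraper interface {
	Search(term string) ([]Offer, error)
}

// Neither type below mentions Scraper anywhere in its declaration.
type browserScraper struct{}

func (browserScraper) Search(term string) ([]Offer, error) {
	return []Offer{{Title: term + " (from a page)"}}, nil
}

type apiScraper struct{}

func (apiScraper) Search(term string) ([]Offer, error) {
	return []Offer{
		{Title: term + " (from an API)"},
		{Title: term + " bundle"},
	}, nil
}

// countOffers accepts any value whose method set includes a matching Search.
func countOffers(s Scraper, term string) (int, error) {
	offers, err := s.Search(term)
	return len(offers), err
}

func main() {
	for _, s := range []Scraper{browserScraper{}, apiScraper{}} {
		n, err := countOffers(s, "mega drive")
		fmt.Println(n, err)
	}
}
```

Renaming or removing `Search` on either type would turn the slice literal in `main` into a compile error, which is the whole contract.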
The first external PR
A PR by Diego Ritzel (thanks again, Diego!) was living proof that GOrimpo’s hexagonal architecture is easy to understand. Even without a CONTRIBUTING.md to read (I hadn’t created one yet, my bad), he implemented a new adapter for Gotify, which will arrive in v1.3.0. Thanks to the ports.Notifier interface, the adapter required few code changes and the business logic stayed intact, regardless of the notifier in use. Even today, in v1.2.0, a new notifier can be implemented simply and safely.
Interfaces in Go
For Diego to create that adapter, the port had to be defined first. Here is how I defined the Notifier interface:
package ports

import "github.com/LXSCA7/gorimpo/internal/core/domain"

type Notifier interface {
	SetRoutes(routes map[string]string)
	Send(offer domain.Offer, category, searchTerm string, showSearchTerm bool) error
	SendText(message, category string) error
	SendPhoto(data []byte, caption string, category string) error
	CreateCategory(name string) (string, error)
}
Looking at the Notifier interface today, I see it could be more segregated. In Go, it’s easy to fall into the temptation of creating “fat” interfaces, but true elegance lies in composition. One of the planned improvements is breaking the Notifier into smaller pieces (Text, Photo, Category), so that simpler notifiers don’t have to implement methods they don’t use.
The idea is to do something similar to what was done with the Scraper interface: use Interface Embedding (Interface Composition) in Go—something I only learned after the Scraper interface was already finalized. Take a look at it:
package ports

import "github.com/LXSCA7/gorimpo/internal/core/domain"

type Scraper interface {
	Search(term string) ([]domain.Offer, error)
}

type VisualScraper interface {
	Scraper
	GetLastScreenshot() []byte
}
Instead of polluting the basic Scraper interface with screenshot methods that an API-based scraper wouldn’t know how to handle (the mistake I made with the Notifier), we use Interface Composition. Each adapter implements only the capabilities it actually has. Applied to the Notifier, the same split would look like this:
// base
type Notifier interface {
	SendText(message, category string) error
}

// for notifiers supporting media (Telegram, Discord)
type VisualNotifier interface {
	Notifier
	SendPhoto(data []byte, caption string, category string) error
}

// for notifiers managing channels/topics (Telegram, Discord, Slack)
type CategorizedNotifier interface {
	Notifier
	CreateCategory(name string) (string, error)
}
With this split, a text-only notifier implements a single method, and the core asks for the extra capabilities only where it needs them.
Ensuring the contract at compile time
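For the Telegram adapter, we can make the compiler verify that it implements every port. Each line assigns a nil *TelegramAdapter to a blank variable of the interface type, so a missing method becomes a build error instead of a runtime surprise: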
var _ ports.Notifier = (*TelegramAdapter)(nil)
var _ ports.CategorizedNotifier = (*TelegramAdapter)(nil)
var _ ports.VisualNotifier = (*TelegramAdapter)(nil)
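The same idiom can be tried in a self-contained file. In this sketch the interfaces are redeclared locally and `consoleNotifier` is a hypothetical text-only adapter, so nothing here is taken from GOrimpo's packages.

```go
package main

import "fmt"

type Notifier interface {
	SendText(message, category string) error
}

type VisualNotifier interface {
	Notifier
	SendPhoto(data []byte, caption string, category string) error
}

// consoleNotifier is a hypothetical adapter that only records text messages.
type consoleNotifier struct{ sent []string }

func (c *consoleNotifier) SendText(message, category string) error {
	c.sent = append(c.sent, category+": "+message)
	return nil
}

// Compiles: *consoleNotifier has every method Notifier requires.
// The declaration produces no runtime value; it exists only for the check.
var _ Notifier = (*consoleNotifier)(nil)

// Uncommenting the next line fails the build, because *consoleNotifier
// has no SendPhoto method:
// var _ VisualNotifier = (*consoleNotifier)(nil)

func main() {
	n := &consoleNotifier{}
	if err := n.SendText("new offer found", "games"); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(n.sent[0]) // prints "games: new offer found"
}
```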
And to validate whether we can send a photo or not, we can use:
if v, ok := notifier.(ports.VisualNotifier); ok {
	v.SendPhoto(...)
}
This is the Interface Segregation Principle (the “I” in SOLID) in its purest form. Instead of one giant interface, we have several small ones. An API-based scraper satisfies only Scraper. A browser-based scraper that can capture a screenshot of an error also satisfies VisualScraper. The core decides how to use each one without ever knowing who is behind them.
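The runtime capability check for notifiers can be sketched end to end as follows. `Announce`, `textOnly`, and `withPhotos` are hypothetical names, and the text fallback is an assumption about how a core service might behave, not GOrimpo's actual logic.

```go
package main

import "fmt"

type Notifier interface {
	SendText(message, category string) error
}

type VisualNotifier interface {
	Notifier
	SendPhoto(data []byte, caption string, category string) error
}

// textOnly is a hypothetical adapter without media support.
type textOnly struct{ log []string }

func (t *textOnly) SendText(message, category string) error {
	t.log = append(t.log, "text:"+message)
	return nil
}

// withPhotos is a hypothetical adapter that also supports photos.
type withPhotos struct{ log []string }

func (w *withPhotos) SendText(message, category string) error {
	w.log = append(w.log, "text:"+message)
	return nil
}

func (w *withPhotos) SendPhoto(data []byte, caption string, category string) error {
	w.log = append(w.log, fmt.Sprintf("photo:%s (%d bytes)", caption, len(data)))
	return nil
}

// Announce sends a photo when the notifier supports it and a photo is
// available; otherwise it falls back to plain text.
func Announce(n Notifier, caption string, photo []byte, category string) error {
	if v, ok := n.(VisualNotifier); ok && len(photo) > 0 {
		return v.SendPhoto(photo, caption, category)
	}
	return n.SendText(caption, category)
}

func main() {
	a, b := &textOnly{}, &withPhotos{}
	photo := []byte{0x89, 0x50, 0x4e, 0x47}
	for _, n := range []Notifier{a, b} {
		if err := Announce(n, "SNES for sale", photo, "games"); err != nil {
			fmt.Println("error:", err)
		}
	}
	fmt.Println(a.log, b.log)
}
```

The caller only ever holds a `Notifier`; the type assertion is the single place where the richer capability is discovered.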
Conclusion
GOrimpo is still at v1.2.0 and has a long way to go. The lessons learned with VisualScraper are already being applied to the refactoring of Notifiers in v1.3.0.
In the end, architecture isn’t about following a manual religiously, but about building something that doesn’t make you afraid to change it six months later. How have you been protecting your core from the mess of scrapers?
Let’s chat over at the GOrimpo repository!