QGato v0.6.0 - Getting ready for large-scale usage

We’ve been gearing up QGato for large-scale usage with a host of improvements across proxy support, search functionality, and a brand-new experimental indexing feature. This release brings us one step closer to a robust and scalable system, moving beyond a simple meta/proxy-search engine.

Introducing Experimental Indexing & Crawler Enhancements

In this update, we’ve introduced experimental indexing. Previously, QGato operated solely as a meta/proxy-search engine, which can be unstable whenever the page structures of the upstream search engines change. We still rely on multiple search engines as a backup, but there’s no guarantee they will always work. To mitigate this, we’ve implemented a basic self-crawling and indexing system that runs directly on the server. Despite its simplicity, our early testing suggests that this approach is quite promising.
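To illustrate the backup idea, here is a minimal sketch in Python of trying several search backends in order and using the first one that returns results. It is not QGato's actual code: the function name `search_with_fallback`, the `(name, search_fn)` pairs, and the ordering of backends are all assumptions made for this example.

```python
import logging


def search_with_fallback(query, backends):
    """Try each (name, search_fn) pair in order.

    Returns (backend_name, results) for the first backend that yields a
    non-empty result list, or (None, []) if every backend fails.
    """
    for name, search_fn in backends:
        try:
            results = search_fn(query)
        except Exception as exc:  # e.g. a scraper broken by a changed page layout
            logging.warning("backend %s failed: %s", name, exc)
            continue
        if results:
            return name, results
    return None, []
```

A backend that raises or returns nothing is simply skipped, so one broken scraper does not take the whole search down.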

Getting search results using Indexing diagram

Here’s how our new search and indexing process works:

  • Step 1: The script downloads a list of the top 10M most visited websites.
  • Step 2: Crawlers sequentially visit these websites, attempting to retrieve relevant information.
  • Step 3: The retrieved data is fed into the indexer, which then generates text search results from the gathered information.

Serving results from our own index is not only significantly faster than scraping results from external search engines, it also lays the foundation for potentially replacing meta-search entirely once the system becomes robust enough.
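The three steps above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the QGato implementation: the `domains` list stands in for the downloaded top-sites list, `fetch` is a hypothetical callable that returns a page's HTML, and the in-memory inverted index (`Index`) is just one simple way to turn crawled text into search results.

```python
import re
from collections import defaultdict
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects the <title> and the visible text of an HTML page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.chunks = []
        self._in_title = False
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth:
            return
        if self._in_title:
            self.title += data
        self.chunks.append(data)


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def crawl(domains, fetch):
    """Step 2: visit each site in order; sites that fail to load are skipped."""
    for domain in domains:
        url = f"https://{domain}/"
        try:
            yield url, fetch(url)
        except OSError:
            continue


class Index:
    """Step 3: a tiny in-memory inverted index (token -> set of URLs)."""

    def __init__(self):
        self.postings = defaultdict(set)
        self.titles = {}

    def add_page(self, url, html):
        parser = TextExtractor()
        parser.feed(html)
        parser.close()
        self.titles[url] = parser.title.strip()
        for token in tokenize(" ".join(parser.chunks)):
            self.postings[token].add(url)

    def search(self, query):
        tokens = tokenize(query)
        if not tokens:
            return []
        # Keep only pages that contain every query token.
        urls = set.intersection(*(self.postings.get(t, set()) for t in tokens))
        return sorted((url, self.titles[url]) for url in urls)
```

Feeding the output of `crawl(...)` into `Index.add_page(...)` and then calling `Index.search("some words")` returns `(url, title)` pairs for pages containing all query words. A real indexer would also need ranking, persistence, and politeness rules such as robots.txt handling, all of which this sketch leaves out.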

Here’s how Meta-Search works (for comparison):

Getting search results using Meta-search diagram

New About QGato & Privacy Policy

In the “mini-menu”, there is now a new button that opens a simple About QGato window. This window includes two buttons: one to view the QGato source code and another leading to the new Privacy Policy page.


The new About QGato page is designed to clearly state that we do not collect any user data and that the project is released under the AGPL-3.0 license. For additional transparency, the page also provides a list of all cookies used by the website, along with short descriptions of their purposes.


Under-the-Hood Changes

Enhanced Proxy & Search Capabilities

  • SOCKS5 Proxy Support: You can now route your requests through a SOCKS5 proxy.
  • Improved Search:
      • Fixed an issue where no images were retrieved from Qwant due to incorrect user agents.
      • Resolved problems with Google text search.
      • Updated the SearXNG search integration for more reliable results.
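For readers unfamiliar with SOCKS5 routing, the sketch below shows the general idea in Python. It is not how QGato configures its proxies; the helper name `socks5_proxies` and the example host and port are made up for illustration. It builds a proxy mapping in the format the `requests` library accepts when installed with SOCKS support (`pip install "requests[socks]"`).

```python
from urllib.parse import quote


def socks5_proxies(host, port, username=None, password=None, remote_dns=True):
    """Build a `proxies` mapping in the format accepted by `requests`."""
    # "socks5h" lets the proxy resolve hostnames; "socks5" resolves them locally.
    scheme = "socks5h" if remote_dns else "socks5"
    auth = ""
    if username is not None:
        # Percent-encode credentials so characters like "@" or ":" stay unambiguous.
        auth = quote(username, safe="")
        if password is not None:
            auth += ":" + quote(password, safe="")
        auth += "@"
    url = f"{scheme}://{auth}{host}:{port}"
    return {"http": url, "https": url}


# Example usage (needs network access and requests[socks]):
#   import requests
#   proxies = socks5_proxies("127.0.0.1", 9050)
#   requests.get("https://example.org", proxies=proxies, timeout=10)
```

Using `socks5h` rather than `socks5` keeps DNS lookups on the proxy side, which is usually what you want when the proxy is there for privacy.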

Improved Testing & Stability

  • Test Enhancements: Extended the server startup wait time in tests to avoid failures caused by slow startups.
  • Run Script Cleanup: Removed the deprecated ‘build mode’ from the run scripts.
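A fixed, longer wait is the simplest fix for slow startups. An alternative pattern, sketched here in Python purely as an illustration (the function name `wait_for_server` and its default values are assumptions, not part of QGato's test suite), is to poll the server port until it accepts connections or a deadline passes:

```python
import socket
import time


def wait_for_server(host, port, timeout=30.0, interval=0.25):
    """Return True once a TCP connection succeeds, False if `timeout` elapses."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means the server is listening.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

With polling, tests start as soon as the server is ready and only wait the full timeout when something is actually wrong.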

Security & Privacy Boosts

  • Secure Cookies: We’ve introduced secure cookie settings to bolster user security.
  • Privacy Policy & About Section: Dedicated pages now inform users about our privacy practices.
  • Crash Fixes: Crashes when the indexer is disabled have been resolved, and directory checks have been improved.
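The release notes do not spell out which cookie settings were changed, so the following Python sketch only shows what "secure cookie settings" commonly means: the `Secure`, `HttpOnly`, and `SameSite` attributes. The helper `secure_cookie_header` and the cookie name used in the example are hypothetical.

```python
from http.cookies import SimpleCookie


def secure_cookie_header(name, value, max_age=None):
    """Return the value of a Set-Cookie header with hardened attributes."""
    cookie = SimpleCookie()
    cookie[name] = value
    morsel = cookie[name]
    morsel["secure"] = True        # only send the cookie over HTTPS
    morsel["httponly"] = True      # hide the cookie from JavaScript
    morsel["samesite"] = "Strict"  # do not send it on cross-site requests
    morsel["path"] = "/"
    if max_age is not None:
        morsel["max-age"] = max_age  # lifetime in seconds
    return morsel.OutputString()


# Example: a preference cookie that lives for one hour.
print(secure_cookie_header("theme", "dark", max_age=3600))
```

Whether `Strict` or `Lax` is the right `SameSite` value depends on how the site is linked to from elsewhere, so treat the choice here as a placeholder.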

Update Log Summary

  • Introduced experimental indexing with self-crawling: Downloads a list of the top 10M websites, crawls them sequentially, and feeds the extracted data into an indexer to generate text search results.
  • Laid the groundwork for potentially replacing meta-search entirely with the new indexing approach.
  • Enhanced proxy support with SOCKS5 to improve search reliability and avoid rate-limiting.
  • Fixed issues with image retrieval from Qwant by correcting user agents.
  • Resolved problems with Google text search and updated SearXNG integration for more dependable results.
  • Removed ‘build mode’ from run scripts and refined tests for better reliability.
  • Improved security with secure cookies.
  • Added dedicated Privacy Policy page.
  • Added ‘About QGato’ pop-up.
  • Fixed crashes when the indexer is disabled and improved directory checks.
  • Extended the server startup wait time in tests to avoid failures caused by slow startups.

Stay tuned as we continue to push QGato towards delivering a faster, more secure, and scalable experience!